Script to extract m3u8 file from URL

tubbadu@lemmy.kde.social to Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ@lemmy.dbzer0.com – 36 points –

Hello! I'd like to write a script to download videos from streamingcommunity.estate from a given video URL, and to do this I need the m3u8 file url. Currently I manually go to the network tab to search for it, but I'd like the script to do this automatically. Do you know of a way to achieve this? Bash or Python if possible, otherwise any other method will do fine. Thanks in advance!


youtube-dl does something similar (it works on a lot more than just YouTube). AFAIK each site needs slightly different extraction code. You could have a look at the source on their GitHub.

It might be as easy as forking the project and creating your own extractor.py.

I actually use yt-dlp to download m3u8 playlists, but what I'm looking for is a way to extract the m3u8 file URL, so that I can give it to yt-dlp to download the actual video.

I'll look into the extractor docs, seems interesting! Thanks!
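Once the playlist URL is known, handing it to yt-dlp from a script is the easy part. Here is a minimal sketch in Node.js, assuming `yt-dlp` is installed and on the PATH. `buildYtDlpArgs` and `downloadPlaylist` are hypothetical names made up for illustration, and the only yt-dlp option used is `-o` (the output filename template).

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper: builds the argument list passed to yt-dlp.
function buildYtDlpArgs(playlistUrl, outputFile) {
  return ['-o', outputFile, playlistUrl];
}

// Hypothetical helper: runs yt-dlp and resolves with its exit code.
function downloadPlaylist(playlistUrl, outputFile) {
  return new Promise((resolve, reject) => {
    const child = spawn('yt-dlp', buildYtDlpArgs(playlistUrl, outputFile), {
      stdio: 'inherit', // show yt-dlp's own progress output
    });
    child.on('error', reject); // e.g. yt-dlp is not installed
    child.on('close', (code) => resolve(code));
  });
}
```

Something like `downloadPlaylist(playlistUrl, 'video.mp4')` would then start the download; some sites may also need extra request headers, which this sketch does not handle.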

I sure hope they commit the work back into the main repo...

I doubt they would be able to. I was interested and looked around a bit. There's even a whole chapter in their documentation dedicated to adding extractors. The first paragraph of that chapter is basically "do not add piracy websites".

There is a Firefox add-on I just found that does this. It's called "Extract video link". Maybe look at its source code.

How did you go, OP? I tried to get the stream video and failed with my methods. 😥

Hi, sorry for the late reply! I finally wrote this Node.js script:

const puppeteer = require('puppeteer');

function findPlaylistUrl(networkUrls) {
  for (const url of networkUrls) {
    if (url.startsWith('https://vixcloud.co/playlist')) {
      return url;
    }
  }
  return ''; // Return an empty string if no matching URL is found
}

(async () => {
  // Check if URL argument is provided
  if (process.argv.length <= 2) {
    console.error('Usage: node get_network_urls.js <url>');
    process.exit(1);
  }

  const url = process.argv[2];

  // Launch a headless browser
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  // Enable request interception
  await page.setRequestInterception(true);

  // Capture network requests
  const networkUrls = [];
  page.on('request', (request) => {
    networkUrls.push(request.url());
    request.continue();
  });

  // Navigate to the URL
  await page.goto(url);

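  // Wait for a while to capture network requests (adjust as needed).
  // A plain timer is used because page.waitForTimeout() is no longer
  // available in recent Puppeteer releases.
  await new Promise((resolve) => setTimeout(resolve, 5000));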

  // Print the playlist URL (an empty line if none was found)
  console.log(findPlaylistUrl(networkUrls));
    
  // Close the browser
  await browser.close();
})();

The first argument passed to the script is the URL of the webpage. The script uses the puppeteer module to drive a headless browser, records the URL of every network request the page makes, and then searches through them for the m3u8 playlist. It is very specific and only works on this website (it looks for URLs starting with https://vixcloud.co/playlist), but it can easily be adapted for other websites as well.
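As a rough idea of what adapting it might look like, here is a sketch of a more general matcher. It is not part of the script above: `findM3u8Url` and its `extraPrefixes` parameter are hypothetical, and it assumes the target site either requests a URL whose path ends in `.m3u8` or uses a known prefix, as this site does.

```javascript
// Hypothetical replacement for findPlaylistUrl: returns the first captured
// URL that looks like an HLS playlist, or '' if none matches.
function findM3u8Url(networkUrls, extraPrefixes = []) {
  for (const raw of networkUrls) {
    let parsed;
    try {
      parsed = new URL(raw);
    } catch {
      continue; // skip anything that is not a valid absolute URL
    }
    // Match on the path so query strings (tokens, expiry) do not interfere.
    if (parsed.pathname.toLowerCase().endsWith('.m3u8')) {
      return raw;
    }
    // Some sites serve playlists without an extension; fall back to prefixes.
    if (extraPrefixes.some((prefix) => raw.startsWith(prefix))) {
      return raw;
    }
  }
  return '';
}
```

For this website the call would be `findM3u8Url(networkUrls, ['https://vixcloud.co/playlist'])`; for another site, the prefix list is the part to change.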

Thanks heaps for the reply. Legend. After using ChatGPT and Bing Chat I got the script cleaned up and running, but unfortunately there's no output. 😥 I don't need the content, I just like to defeat these websites. Oh well. Don't want to bother you. My command was "node your-script.js https://streamingcommunity.express/watch/3330"

Uhm, that's strange, I just tried running it on your link and it worked. Did you wait at least 5 seconds after starting the script?
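If the fixed 5-second wait turns out to be the problem, one alternative is to wait for the playlist request itself. The sketch below relies on Puppeteer's `page.waitForRequest(predicate, { timeout })`; the `waitForPlaylistUrl` wrapper, its default timeout, and the choice to return an empty string on timeout are assumptions of mine, not part of the original script.

```javascript
// Hypothetical helper: resolves with the first request URL that starts with
// `prefix`, or '' if nothing matches within `timeoutMs` milliseconds.
async function waitForPlaylistUrl(page, prefix, timeoutMs = 15000) {
  try {
    const request = await page.waitForRequest(
      (req) => req.url().startsWith(prefix),
      { timeout: timeoutMs }
    );
    return request.url();
  } catch (err) {
    // Assumption: Puppeteer reports a timeout as an error named 'TimeoutError'.
    if (err && err.name === 'TimeoutError') {
      return '';
    }
    throw err; // anything else is a real failure
  }
}

// Usage inside the script: start waiting BEFORE navigating, so the request
// cannot slip through between goto() and the wait.
//   const pending = waitForPlaylistUrl(page, 'https://vixcloud.co/playlist');
//   await page.goto(url);
//   console.log(await pending);
```

With this approach the script finishes as soon as the playlist request shows up, and the timeout only matters when the page never requests it.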

Thanks for the reply, appreciate it. The thing is, when I try your code it errors out, so I had to use Bing Chat or ChatGPT to clean it up. Any chance you could upload your file somewhere, please?