Scrapebox Everything Else

If I reformat my hard drive or get a new computer what do I do?

Simply re-download and reinstall Scrapebox.  When you load the program, click the Activate button, just like you did the first time.  Then fill out the required info, check the transfer license option, and hit submit.

(You can re-download scrapebox here: http://www.scrapebox.com/payment-received)

How do I transfer my scrapebox license to a new PC?

You are permitted to transfer your ScrapeBox license to another PC once per month for free in case you get a new PC, re-install Windows etc.  For complete instructions on how to do this go here: http://www.scrapebox.com/scrapebox-license-transfer

Does scrapebox have an affiliate program?

No, not at this time.

How do I filter adult urls or content in scrapebox?

Load the list of urls you want to filter into the Urls harvested section.  Then go to remove/filter >> remove urls containing entries from.

Scrapebox will then ask you to load a text file.  Put the terms below, plus any others you want to add, in a text file and save it.  Then load that file when prompted.

Of course some of these terms could filter out good urls as well, such as "bride" for instance, but this is the best alternative for filtering adult urls.

List:

sex
porn
pron
szex
xxx
x-live
x-video
xvideo
hentai
erotic
chick
tit
boob
slut
anal
poker
babe
blonde
brunette
russian
bride
fuk
redhead
penis
dick
blowjob
oral
gay
lesbian
pussy
vagina
gangbang
bondage
adult
teen
girl
woman
dirty
fuck
ass
bitch
shit
butt
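
To see why a term like "bride" can also remove good urls, here is a minimal Python sketch of this kind of filter.  It is not Scrapebox's own code.  It assumes a plain, case-insensitive substring match, and the function names and example urls are made up for illustration.

```python
def load_terms(lines):
    """Normalize filter terms: trim whitespace, lowercase, drop blank lines."""
    return [line.strip().lower() for line in lines if line.strip()]


def remove_urls_containing(urls, terms):
    """Keep only the urls that contain none of the terms (substring match)."""
    return [u for u in urls if not any(t in u.lower() for t in terms)]


# Hypothetical example data, not taken from a real harvest.
terms = load_terms(["sex", "porn", "bride", ""])
urls = [
    "http://example.com/gardening-tips",
    "http://example.com/bridesmaid-dresses",
    "http://example.org/porn-videos",
]
kept = remove_urls_containing(urls, terms)
```

Here the bridesmaid url is dropped along with the adult one, because "bride" appears inside "bridesmaid".  Trimming the term list is the only way to avoid that kind of false positive.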

Pinging your links to get them indexed

If you want to Ping your links to get them indexed you need to use the RSS ping function, which is labeled simply RSS in the commenter section of scrapebox. The option labeled PING is for inflating page views and won't get your urls indexed.

RSS Ping follows the XML-RPC spec: http://www.xmlrpc.com

So the way to do it is to import the file that contains the urls you want to get indexed into the harvester grid, then go to Export URL List >> Export as RSS XML List. Then scan the URLs, which fetches the link titles and descriptions, set how many entries go in each feed, and export. It saves one or more .xml files, which then need to be uploaded to your domain, so that each feed url looks like: http://www.scrapeboxfaq.com/feed.xml
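
For a rough idea of what the exported files contain, here is a Python sketch that splits a list of links into RSS feeds with a set number of entries each.  The exact XML that Scrapebox writes is not documented here, so the plain RSS 2.0 layout, the channel title and link, and the build_feeds name are all assumptions.

```python
import xml.etree.ElementTree as ET


def build_feeds(entries, per_feed):
    """Split (url, title, description) entries into RSS 2.0 documents."""
    feeds = []
    for start in range(0, len(entries), per_feed):
        rss = ET.Element("rss", version="2.0")
        channel = ET.SubElement(rss, "channel")
        # Placeholder channel details (assumed, not Scrapebox's values).
        ET.SubElement(channel, "title").text = "Links"
        ET.SubElement(channel, "link").text = "http://www.example.com/"
        ET.SubElement(channel, "description").text = "Exported links"
        for url, title, description in entries[start:start + per_feed]:
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = title
            ET.SubElement(item, "link").text = url
            ET.SubElement(item, "description").text = description
        feeds.append(ET.tostring(rss, encoding="unicode"))
    return feeds


# Hypothetical entries; Scrapebox fetches titles and descriptions by scanning.
entries = [
    ("http://www.example.com/a", "Page A", "First page"),
    ("http://www.example.com/b", "Page B", "Second page"),
    ("http://www.example.com/c", "Page C", "Third page"),
]
feeds = build_feeds(entries, per_feed=2)
```

With three entries and two per feed you get two feed documents, which you would save as separate .xml files before uploading.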

Then select RSS in the commenter section. Load the RSS Services and feed URLs to ping them.  There are default RSS services that come with scrapebox, or you can use your own.  The feed urls are the ones you uploaded to your domain, like the one above.
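
If you are curious what an XML-RPC ping looks like on the wire, here is a small Python sketch.  It assumes the service accepts the widely used weblogUpdates.ping(name, url) call.  Which method Scrapebox actually sends is not stated here, and the feed title and function names are placeholders.

```python
import xmlrpc.client


def build_ping_request(feed_title, feed_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps(
        (feed_title, feed_url), methodname="weblogUpdates.ping"
    )


def send_ping(service_url, feed_title, feed_url):
    """Send the ping to one RSS service (makes a real network request)."""
    server = xmlrpc.client.ServerProxy(service_url)
    return server.weblogUpdates.ping(feed_title, feed_url)


# Build the payload only; nothing is sent here.
payload = build_ping_request("My feed", "http://www.scrapeboxfaq.com/feed.xml")
```

send_ping would be called once for each service url and feed url pair, which is roughly what loading both lists into the RSS function does for you.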

Scrape emails from Craigslist

You can grab emails with the email grabber in the harvested urls section. It will let you harvest emails from a url or a local file.

Say you wanted to harvest emails from the Jobs category on Craigslist.

In a regular web browser open up Craigslist. Find the category you want to harvest from.  For the jobs category in most major cities, the url looks like this:

http://losangeles.craigslist.org/jjj/

I got this by selecting the city I wanted, and then clicking the "jobs" link at the top of the category.

Then you would copy down that url, which is the one shown above.  Note: if it gives you a spam warning, make sure you follow through to get the actual url of the page that lists the ads.

If you like you can also copy down the urls of the "Next 100 results".

Then save off all of the urls from the categories you want.

Then import them into the Link Extractor addon.

Choose Internal only.

Then let it harvest all the urls from those pages.  This will give you all the current craigslist ads for each category from all the pages you choose.

Then export the results to a txt file.

Then import that txt file into the Urls harvested section.

Then use the email grabber to get the emails from those urls.  Thus you have scraped all the emails from Craigslist for the current ads from the categories you have chosen.

The best part is the category urls are static, but the urls that you harvest from them change daily, so you can repeat this process over and over.
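
The "Internal only" choice in the Link Extractor step can be pictured with the Python sketch below.  It is an illustration, not the addon's code.  It treats a link as internal when it has the same hostname as the page it was found on, which is an assumption, and the sample HTML is made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html):
    """Return absolute, de-duplicated links on the same host as page_url."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlsplit(page_url).hostname
    seen, result = set(), []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        if urlsplit(absolute).hostname == host and absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# Made-up page content for illustration.
page = "http://losangeles.craigslist.org/jjj/"
sample_html = (
    '<a href="/acc/123.html">Ad one</a>'
    '<a href="http://www.example.com/">External</a>'
    '<a href="456.html">Ad two</a>'
    '<a href="/acc/123.html">Ad one again</a>'
)
links = internal_links(page, sample_html)
```

The external link and the duplicate are dropped, leaving only the ad urls that live on the same site as the category page.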

What does the delay apply to?

The delay option in the comment poster section lets you set a delay in seconds. If RND is chosen, then it pulls a random value from the adjust RND delay range under settings.

The delay only works with:
Single Threaded Harvester
Ping Mode
Page Rank Checker
Email Grabber
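
As a rough picture of the two delay choices, here is a Python sketch.  The function name, the default range, and the use of whole seconds are assumptions for illustration, not Scrapebox's actual settings.

```python
import random


def next_delay(fixed_seconds, use_rnd, rnd_range=(1, 5)):
    """Return the number of seconds to wait before the next request."""
    if use_rnd:
        low, high = rnd_range
        # Pick a whole number of seconds anywhere in the configured range.
        return random.randint(low, high)
    return fixed_seconds


fixed = next_delay(3, use_rnd=False)
randomized = next_delay(3, use_rnd=True, rnd_range=(2, 10))
```

A tool would then pause for that many seconds (for example with time.sleep) between requests in the modes listed above.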

Will Scrapebox work with TOR?

While I have not personally attempted to torify scrapebox to see if it will work, I have looked at it and I am fairly confident it won't.  Even if it did, the many limitations of TOR, combined with how scrapebox works, mean it wouldn't work well.  Simply scraping public proxies would work better.

Does scrapebox work on WINE?

No, Scrapebox will not work on WINE.  WINE lacks many of the APIs and other elements that scrapebox needs to work.

Why does the remove subdomains from URLs function not work for some domains?

With the new ability to register generic tlds it has gotten a bit confusing.  So Scrapebox uses a database to remove subdomains from urls.

However sometimes you will have a list and most of it seems to work, but you will be left with stuff like

something.blogspot.com

and wonder why that subdomain wasn't removed.  The answer is that it is not a subdomain, it's a domain, because blogspot.com is itself listed as a tld.  It would seem that .com is the tld, but it's not, it's blogspot.com.  So

car.something.blogspot.com

is a subdomain but

something.blogspot.com is a regular domain just like car.com is a regular domain.  You can view the complete list here:

https://publicsuffix.org/list/effective_tld_names.dat
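
Here is a simplified Python sketch of how a suffix list leads to that result.  It uses a tiny hand-picked set of suffixes instead of the full list and ignores the list's wildcard and exception rules, so treat it as an illustration rather than how Scrapebox does it.  The function name is made up.

```python
# Tiny illustrative subset of the public suffix list (assumed for the demo).
SUFFIXES = {"com", "co.uk", "blogspot.com"}


def remove_subdomain(host, suffixes=SUFFIXES):
    """Reduce a hostname to its suffix plus one label to the left of it."""
    labels = host.lower().split(".")
    # Scanning from the left means the first match is the longest suffix.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            if i == 0:
                return host.lower()  # the host is itself a suffix
            return ".".join(labels[i - 1:])
    return host.lower()  # no known suffix, leave it unchanged


a = remove_subdomain("car.something.blogspot.com")
b = remove_subdomain("something.blogspot.com")
c = remove_subdomain("www.car.com")
```

Because blogspot.com is in the suffix set, something.blogspot.com comes back unchanged, while car.something.blogspot.com is cut down to something.blogspot.com and www.car.com to car.com.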