Project summary/goal. We process domains in bulk to screen them for spam, checking tens of thousands of domains per day. We want to quickly find which domains have spam sites linking to them by checking their incoming link anchors.
Once we have that data, we want to check the anchors against a keyword blacklist and remove any domain that has an anchor containing a blacklisted keyword.
We want to create a web script that will take a list of domains and send them to the [login to view URL] API for processing.
API Docs: [login to view URL]
We want to send the domains and pull back the anchors associated with each domain. Specifically this API call - [login to view URL]
Once we have the data back, we want the script to take a list of keywords we provide and check the anchors to see which keywords they contain.
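As a rough illustration of the keyword check described above, here is a minimal Python sketch. It assumes the anchors have already been fetched and grouped per domain in a dictionary, and that "contains" means a case-insensitive substring match. The function names `matched_keywords` and `split_domains` and the sample data are hypothetical and only for illustration; the API call itself is not shown.

```python
def matched_keywords(anchor, blacklist):
    """Return the blacklist keywords found in one anchor text.

    Assumption: a case-insensitive substring match counts as "contains".
    """
    text = anchor.lower()
    return [kw for kw in blacklist if kw.strip() and kw.strip().lower() in text]


def split_domains(anchors_by_domain, blacklist):
    """Split domains into (clean, flagged) dictionaries.

    A domain is flagged if at least one of its anchors contains
    a blacklisted keyword; otherwise it is kept as clean.
    """
    clean, flagged = {}, {}
    for domain, anchors in anchors_by_domain.items():
        has_hit = any(matched_keywords(anchor, blacklist) for anchor in anchors)
        target = flagged if has_hit else clean
        target[domain] = anchors
    return clean, flagged


if __name__ == "__main__":
    # Hypothetical sample data standing in for the API response.
    sample = {
        "example.com": ["Best hiking boots", "cheap viagra online"],
        "example.org": ["company homepage"],
    }
    clean, flagged = split_domains(sample, ["viagra", "casino"])
    print("clean:", list(clean))
    print("flagged:", list(flagged))
```

Whether matching should be on substrings or whole words is a design choice to confirm with us; substring matching catches more variants but may flag harmless anchors.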
We want a simple web interface where we can provide a list of domains and a set of blacklist keywords. The script will send the domains for processing and then write the results into 2 csv files.
Output 1: A csv file with all domains that did not match any blacklist keyword, and their anchors, in 2 columns
Output 2: A csv file with all domains that matched a blacklist keyword, and their anchors, in 2 columns
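The two output files could be produced along these lines. This is only a sketch under assumptions: one row per domain and anchor pair, a header row of `domain,anchor`, and a case-insensitive substring match against the blacklist. The function name `write_reports` and the file paths are hypothetical.

```python
import csv


def write_reports(anchors_by_domain, blacklist, clean_path, flagged_path):
    """Write two 2-column csv files (domain, anchor).

    Assumption: a domain goes to the flagged file if any of its anchors
    contains a blacklist keyword (case-insensitive substring match);
    otherwise all of its rows go to the clean file.
    Returns the number of domains in each group.
    """
    keywords = [kw.strip().lower() for kw in blacklist if kw.strip()]
    counts = {"clean": 0, "flagged": 0}
    with open(clean_path, "w", newline="", encoding="utf-8") as clean_file, \
         open(flagged_path, "w", newline="", encoding="utf-8") as flagged_file:
        clean_writer = csv.writer(clean_file)
        flagged_writer = csv.writer(flagged_file)
        for writer in (clean_writer, flagged_writer):
            writer.writerow(["domain", "anchor"])
        for domain, anchors in anchors_by_domain.items():
            is_flagged = any(kw in anchor.lower()
                             for anchor in anchors for kw in keywords)
            writer = flagged_writer if is_flagged else clean_writer
            for anchor in anchors:
                writer.writerow([domain, anchor])
            counts["flagged" if is_flagged else "clean"] += 1
    return counts
```

The `csv` module handles quoting, so anchors that contain commas or quotes still land in a single column.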
The interface should have a project list where we can create a project and start it, and a settings page where we can update the blacklist keywords.
We can use a web template like this: [login to view URL] - we’ve used it extensively in the past already.
I am a professional web data scraper specializing in Python programs, PHP scripts, .NET programs, crawlers, and bots. My tool can search data and retrieve information from Aa to Zz using an existing list of English words.