570062 Custom Bot to Scrape Email from CSV (Desktop Version)
Completed
Posted almost 12 years ago
Paid on delivery
Read all below to fully understand project...
I need a custom script/bot that I can run on Windows 7 as needed to do the following task. I am not interested in learning a new language to pull this off but if a little work is required on my end to make it happen, then I'll do it.
Here we go..
Let's say I have a CSV sheet with the following columns:
Company
Contact
Address
Phone
URL
Email
Is there a way to have the bot look at the URL (which I can always make column A), scrape the website for an email address and post the results into the Email column (which will be blank originally)?
If there is more than one email address, it can be separated with a comma or space.
Since all emails (whether there is 1 result or 100) will be in one cell when opened in Excel, the Email field can always be column B. This keeps the program from having to work out where to put the results, as the data will sometimes take up more than the 4-5 columns.
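To make the "find emails, put them all in one cell" step concrete, here is a minimal Python sketch. It assumes the page text has already been downloaded; the names extract_emails and to_cell are made up for illustration, and the pattern is deliberately simple, so it will miss obfuscated addresses and can pick up look-alikes such as image names containing "@".

```python
import re

# Simple illustrative pattern, not a full address validator.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def extract_emails(page_text):
    """Return the unique addresses in page_text, in order of first appearance."""
    seen = set()
    found = []
    for match in EMAIL_RE.findall(page_text):
        address = match.lower()  # treat Info@X.com and info@x.com as one result
        if address not in seen:
            seen.add(address)
            found.append(address)
    return found


def to_cell(emails):
    """Join all results into the single value that goes in the Email cell."""
    return ", ".join(emails)
```

For example, to_cell(extract_emails(page_text)) gives one comma-separated string per site, or an empty string when nothing is found.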
I am using HMA so I can rotate my proxies as often as I need to. The program does not need that feature built in.
Multi-threading will be good.
The program should save the results into a new CSV file (in case something becomes corrupt).
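A rough Python sketch of how the multi-threading and the "save to a new CSV" requirement could fit together. It assumes the file has a header row, the URL in column A and Email in column B; fill_email_column and its find_emails argument (any function that takes a URL and returns a list of addresses) are hypothetical names, not part of any existing tool.

```python
import csv
from concurrent.futures import ThreadPoolExecutor


def fill_email_column(in_path, out_path, find_emails, max_workers=8):
    """Read in_path, fill column B from the URL in column A, write out_path.

    The input file is only read, never modified, so it survives a bad run.
    """
    with open(in_path, newline="", encoding="utf-8") as src:
        rows = list(csv.reader(src))
    header, body = rows[0], rows[1:]  # assumption: first row is a header

    def safe_lookup(row):
        # One broken site should not abort the whole run.
        try:
            return find_emails(row[0]) if row and row[0].strip() else []
        except Exception:
            return []

    # pool.map returns results in input order, so rows stay aligned.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe_lookup, body))

    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        for row, emails in zip(body, results):
            padded = row + [""] * (2 - len(row))  # make sure column B exists
            padded[1] = ", ".join(emails)
            writer.writerow(padded)
```

Threads suit this job because almost all the time is spent waiting on websites to respond; max_workers is the knob for how many sites are visited at once.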
The scraper only needs to scrape within the domain/website. I am OK with setting a max, or at least having the option to cap the number of pages scraped or the time spent.
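One way the "stay inside the domain, with a page or time cap" rule might look, sketched in Python. Everything here is an assumption for illustration: crawl_site, site_key and the fetch argument (a function that takes a URL and returns the page HTML, or None on failure) are hypothetical names, and treating "www.example.com" and "example.com" as the same site is a design guess.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def site_key(url):
    """Host name used to decide whether two URLs are on the same site."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def crawl_site(start_url, fetch, max_pages=50, max_seconds=60.0):
    """Breadth-first crawl of one site; returns {url: html} for fetched pages."""
    deadline = time.monotonic() + max_seconds
    home = site_key(start_url)
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    # Stop when out of links, at the page cap, or when time runs out.
    while queue and len(pages) < max_pages and time.monotonic() < deadline:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link, _ = urldefrag(urljoin(url, href))  # absolute URL, no #fragment
            is_web = urlparse(link).scheme in ("http", "https")
            if is_web and link not in seen and site_key(link) == home:
                seen.add(link)
                queue.append(link)
    return pages
```

Breadth-first order visits pages closest to the landing page first, which is usually where a Contact or About page sits, so a low max_pages still has a fair chance of finding an address.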
I've heard that this will be easy with uBot. As long as I can run it from my PC with little extra work, that is fine.
Serious bidders only. I do care about your past references but more importantly, I want to know that you understand the scope of work and can do it.
EXAMPLE
Here are some domains the way they will be listed in the CSV file. In each pair below, the first URL is the domain as it will appear in the CSV file. The second URL is the actual location where the scraper will find the email address.
Since the URL in the CSV may not have an email address on the landing page, the bot will need to crawl all pages within the domain.
[login to view URL] = [login to view URL]
[login to view URL] = [login to view URL]
[login to view URL] = [login to view URL]
[login to view URL] = [login to view URL]
---- NOTE ----- NOTE ----- NOTE ----- NOTE ----- NOTE ----- NOTE -----
To be considered, reply with how you see this working so that I can make sure we are on the same page.
I'm ready to start... are you?
PS - Opportunity for future work as long as the relationship and project go well.
Thanks!