Create a Python script that standardizes scraped data from existing scripts before it is saved to a database
$30-250 USD
In Progress
Posted about 7 years ago
Paid on delivery
I have 10 different scraping scripts running on my VPS. Each one captures data from a website and stores it in a database.
There is a slight problem: the captured data is inconsistent, and I want to display it in a consistent format in my database and on my website. You must create a Python script that will 'standardize' the data.
FOR EXAMPLE: One of the fields that is captured on each website is 'Manufacturer'.
Website 1: 'Manufacturer' = {GE, TOSH, WST}
---> {General Electric, Toshiba, Westinghouse}
Website 2: 'Manufacturer' = {Westing., Toshiba, General elec.}
---> {Westinghouse, Toshiba, General Electric}
I want to insert a script within each of the scraping scripts that accomplishes this. Some of the filters will require Regular Expressions, so your script should be set up to be able to handle that.
** I can fill out the specifics of the arrays myself, for which words should be substituted for the terms. I just need someone with Python knowledge to construct the script and the 'template arrays' and tell me where to place it within my scripts. **
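To illustrate the kind of 'template array' meant here, below is a minimal sketch, not a finished solution. The names `MANUFACTURER_RULES` and `standardize` are hypothetical, and the patterns only cover the example values listed above; the real lists would be filled in per field.

```python
import re

# Hypothetical template: canonical name -> regex patterns that map to it.
# Only the sample values from this posting are covered; extend as needed.
MANUFACTURER_RULES = {
    "General Electric": [r"ge", r"general\s+elec(tric)?\.?"],
    "Toshiba": [r"tosh(iba)?\.?"],
    "Westinghouse": [r"wst", r"westing(house)?\.?"],
}

# Compile every pattern once, case-insensitively, so lookups stay cheap.
_COMPILED_RULES = [
    (canonical, [re.compile(p, re.IGNORECASE) for p in patterns])
    for canonical, patterns in MANUFACTURER_RULES.items()
]


def standardize(value, compiled_rules=_COMPILED_RULES):
    """Return the canonical name for value, or the trimmed value if no rule matches."""
    cleaned = value.strip()
    for canonical, patterns in compiled_rules:
        # fullmatch requires the whole string to match, so "GE" does not
        # accidentally match inside a longer, unrelated name.
        if any(p.fullmatch(cleaned) for p in patterns):
            return canonical
    return cleaned


if __name__ == "__main__":
    for raw in ["GE", "TOSH", "WST", "Westing.", "Toshiba", "General elec."]:
        print(raw, "->", standardize(raw))
```

Values that match no pattern are passed through unchanged (apart from trimmed whitespace), so unknown manufacturers are still stored rather than dropped.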
I will provide you with a sample of one of the scripts. They run Scrapy, and they are all similar enough that you will probably be able to create just one script and it will work for all of my scrapers.
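Since the scrapers run Scrapy, one possible place for this logic is an item pipeline that runs before whatever step writes to the database. The sketch below is only an assumption about how the scripts are laid out: `StandardizePipeline`, `FIELD_RULES`, and the `myproject` module path are hypothetical names, and it assumes items expose a `Manufacturer` field holding a string.

```python
import re

# Hypothetical per-field templates: field name -> list of (pattern, canonical value).
FIELD_RULES = {
    "Manufacturer": [
        (re.compile(r"ge|general\s+elec(tric)?\.?", re.IGNORECASE), "General Electric"),
        (re.compile(r"tosh(iba)?\.?", re.IGNORECASE), "Toshiba"),
        (re.compile(r"wst|westing(house)?\.?", re.IGNORECASE), "Westinghouse"),
    ],
}


class StandardizePipeline:
    """Item pipeline sketch that rewrites known fields to canonical values."""

    def process_item(self, item, spider):
        for field, rules in FIELD_RULES.items():
            value = item.get(field)
            if not isinstance(value, str):
                continue  # field missing or not text: leave the item alone
            cleaned = value.strip()
            for pattern, canonical in rules:
                if pattern.fullmatch(cleaned):
                    cleaned = canonical
                    break
            item[field] = cleaned
        return item


# In settings.py (module path is an assumption), give this pipeline a lower
# number than the database pipeline so it runs first:
# ITEM_PIPELINES = {"myproject.pipelines.StandardizePipeline": 100}
```

If the scrapers save to the database directly inside the spider instead of in a pipeline, the same matching loop could be called on each item just before the save.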
The budget for this project is $50.
Hi sir,
This is Kimi. I am a scraping expert and have done many scraping projects; please check my profile page for details.
https://www.freelancer.com/u/mantislin.html
Can you give me more details? Then I can provide example data and a script for you.
Thanks,
Kimi
Hello,
I have a lot of experience with web scraping and a lot of scripting experience in Python.
I would love to help you with this.
Please contact me for more information.
Hello Sir, how are you?
I read your description and see that you have the array ready. If that's the case, let's just start working on the project! Please contact me, and thank you!
A Python and web scraping developer here, ready to discuss this further and create this script (regex) to standardize the scraped data. Could you send me the sample script so I can better understand exactly what you want done?