Hi,
We recently worked on a scraping and extraction project on real-estate websites, retrieving specific attributes for another client, and we would be interested in your project. We have worked on scraping, crawling, extraction, aggregation, and synchronization for data consistency across various unstructured data sources and websites. We have assembled the results in usable Excel and CSV formats, stored them in databases, and synchronized website updates with the schema. We have also extracted very specific attributes from Groupon, Wikipedia, YouTube, and other product-based websites, primarily using PHP and Perl, with some use of the Scrapy framework.
Below is a short summary of our experience.
* Several years of experience developing text mining, information extraction, and analytics solutions for web crawling, scraping, extraction, and aggregation from unstructured big data such as web pages and text corpora, and loading the results into databases, datastores, and search indexes (Lucene, Solr) for analysis, search, reporting, and dashboards.
* Extensive experience using Perl, PHP, Python, C, Java, and .NET with MySQL, Oracle, and MS SQL Server.
* Information extraction tools: Scrapy, Weka, R, Excel, and Perl CPAN packages for extraction.
Estimated budget: ~745 USD (timeline: 7-10 days)
Price, milestones, and timelines are flexible and negotiable based on the exact project specifications and details, or for any additional project work.