
Scrape a website & insert into database & perform some tasks with the information

$250-750 CAD

Closed
Posted almost 8 years ago

Paid on delivery
I need someone to write software that archives every listing posted on a particular website and uses that information as described in the Features section of this post.

Basic logic of program:
1. Send a request to a website that returns listings in XML format.
2. Check each listing against a MySQL database.
3. Send a web request to each new listing individually to get all of its information.
4. Run features 1, 2, and 3 (explained in detail below).
5. Upload images from the listings to Amazon S3.
6. Add the information for each listing to a MySQL database.
7. Sleep before looping back to step 1 (see feature 4).

Limitations:
The website returns at most 20 listings per request (step 1). If every listing on a page is new, keep requesting the next page of listings until previously archived listings are found, so that no listings are missed. (During peak times, more than 20 listings can be posted within the minimum sleep period of 2 minutes.)

Features:
1. Create a table that tracks listings posted by the same user (identified by two values found in the listing). Keep a tally of how many listings that user has posted and a tally of how many of those listings are unique. (I suggest doing this on a separate thread so as not to slow down the scraping.)
2. If enabled, check each new listing's price against comparable listings on another website (web request to an API), and calculate the average value of comparable listings using the archive of listings in my database. Decide whether the listing is undervalued by a configurable amount or percentage and, if so, send an alert (Amazon SNS and database entry). (This must be done on a separate thread so as not to slow down the scraping.)
3. Check each listing against search criteria, which can be configured by adding rows of criteria to a MySQL database, and send an alert (Amazon SNS and database entry) if a new listing satisfies those criteria. (The criteria will be simple, such as whether the listing's price is >100 or whether the listing is a specific model.) (This must be done on a separate thread so as not to slow down the scraping.)
4. Adjust the sleep time automatically to minimize the number of pages requested before previous listings are found (explained in Limitations). The sleep time before looping has a minimum of 2 minutes, a maximum of 15 minutes from 7AM-11PM, and a maximum of 2 hours from 11PM-7AM.
5. Once daily, check each active listing in the database against the website to see whether the listing has been updated or deleted. If it has been updated, save the changes to the database as a new row. If it has been deleted, change its status in the database so the listing will not be checked again. (I suggest this be a separate script run by a cron job.)

Requirements:
1. Must run on a Linux server.
2. Error handling (website down, website responds with unexpected data, etc.).
3. Log activity and errors in a text file. Send an alert if errors occur (Amazon SNS and entry into database).

The program can be coded in any language that can run on a Linux VPS and take advantage of the multiple IP addresses the server has. PHP would be preferred.
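The paging rule from the limitations and the sleep bounds from feature 4 can be sketched as follows. This is only an illustrative sketch in Python, not a specification: the function names, the `fetch_page` callback, the `max_pages` safety cap, and the halve/grow-by-1.5 adjustment heuristic are all assumptions. Only the 20-listing page size, the 2-minute minimum, and the 15-minute and 2-hour maximums come from the post.

```python
from datetime import datetime, time
from typing import Callable, List, Set, Tuple

PAGE_SIZE = 20                 # from the post: 20 listings per request
MIN_SLEEP_S = 2 * 60           # from the post: minimum sleep of 2 minutes
DAY_MAX_S = 15 * 60            # from the post: maximum from 7AM-11PM
NIGHT_MAX_S = 2 * 60 * 60      # from the post: maximum from 11PM-7AM


def max_sleep_for(now: datetime) -> int:
    """Return the sleep cap in seconds for the given local time."""
    is_daytime = time(7, 0) <= now.time() < time(23, 0)
    return DAY_MAX_S if is_daytime else NIGHT_MAX_S


def collect_new_listings(
    fetch_page: Callable[[int], List[int]],   # hypothetical: page number -> listing ids
    known_ids: Set[int],                      # ids already archived in the database
    max_pages: int = 50,                      # assumed safety cap, not in the post
) -> Tuple[List[int], int]:
    """Request pages until one contains an already archived listing.

    Returns the new listing ids (newest first) and the number of pages fetched.
    """
    new_ids: List[int] = []
    seen_this_run: Set[int] = set()
    pages_fetched = 0
    for page in range(1, max_pages + 1):
        ids = fetch_page(page)
        pages_fetched += 1
        hit_known = False
        for listing_id in ids:
            if listing_id in known_ids:
                hit_known = True
            elif listing_id not in seen_this_run:
                # Listings can shift between pages while paging, so de-duplicate.
                seen_this_run.add(listing_id)
                new_ids.append(listing_id)
        # Stop once previous listings are found or the site runs out of results.
        if hit_known or len(ids) < PAGE_SIZE:
            break
    return new_ids, pages_fetched


def next_sleep(current_s: float, pages_fetched: int, new_count: int, now: datetime) -> int:
    """Assumed heuristic: shorten the sleep when more than one page was needed,
    lengthen it when the first page was mostly old listings, then clamp."""
    if pages_fetched > 1:
        proposed = current_s / 2
    elif new_count < PAGE_SIZE // 2:
        proposed = current_s * 1.5
    else:
        proposed = current_s
    return int(min(max(proposed, MIN_SLEEP_S), max_sleep_for(now)))
```

In a main loop, the scraper would call `collect_new_listings`, process the new ids, and then sleep for `next_sleep(...)` seconds. Clamping last means the 2-minute floor and the time-of-day ceiling always hold, whatever adjustment rule is eventually chosen.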
Project ID: 10186611

About the project

11 proposals
Remote project
Active 8 yrs ago

11 freelancers are bidding on average $386 CAD for this job
Hi, I have read the description and would like to discuss it. I have solid web scraping experience and reviews, and can develop web scraping scripts in Python and C#. I hope we can discuss the details.
$250 CAD in 3 days
5.0 (149 reviews)
6.8
We are a team (19 operators and 2 quality checkers) that has provided research services worldwide for the last 4 years with high-quality output. I have gone through your project description. It is a really interesting job, and our operators are experienced enough in research to collect the data easily from several sources through deep investigation, though it is a somewhat time-consuming job rather than simple copy and paste. We would like to talk in detail and lay out the full structure of how we will do this job if you need it. Let's talk here to discuss the job. Thanks, Dg
$250 CAD in 10 days
4.7 (221 reviews)
7.1
I have reviewed your bid request and I am very interested in your project. I was trained overseas and have an extensive customer service record so contact me so we can discuss further or begin. I work in milestones and the "payment for time" option. If payment is by deliverables, then the milestones are 50% payment once the initial work/draft is done and the remaining can be paid if/when revisions are needed and completed. Bonuses welcomed and much appreciated. I've done many jobs on freelancer.com and hope for many with you and if nothing else add me to your coder list and notify me of your future jobs. Thanks.
$261 CAD in 7 days
2.5 (1 review)
1.7
I have great expertise in web scraping in PHP. I have built up a personal library that lets me accomplish every request easily. I can handle sessions, proxies and avoid anti-scraping controls.
$250 CAD in 3 days
0.0 (0 reviews)
0.0
I am new to Freelancer, but I have been working with a company and doing well there. I have made a few apps and completed more than 1K data entry projects, and my typing speed is almost 95 WPM. I can assure you that I will complete your work and give you the best I can.
$277 CAD in 10 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$555 CAD in 10 days
0.0 (0 reviews)
0.0

About the client

Bangladesh
0.0
0
Member since Apr 11, 2016
