Analyse and refine a CSV file

Closed Posted Nov 14, 2010 Paid on delivery

Analyse and refine a large CSV file so that the end output is a collection of Excel files that are meaningful and manageable.

An application is required to perform the above on any CSV file that follows a defined structure.

## Deliverables

Take any CSV file with 16 columns and approx 1,000,000 rows (see attached).

The columns are as follows...

a) SPORTS_ID

b) EVENT_ID

c) SETTLED_DATE

d) FULL_DESCRIPTION

e) SCHEDULED_OFF

f) EVENT

g) DT ACTUAL_OFF

h) SELECTION_ID

i) SELECTION

j) ODDS

k) NUMBER_BETS

l) VOLUME_MATCHED

m) LATEST_TAKEN

n) FIRST_TAKEN

o) WIN_FLAG

p) IN_PLAY (IP - In-Play, PE - Pre-Event, NI - Event did not go in-play)

Remove all rows where SPORTS_ID (column A) is not equal to 1 (i.e. > 1). This should leave you with 609005 rows including header.
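The filtering step could be sketched roughly as below, using only Python's standard `csv` module. The function name `keep_sport` and the in-memory sample rows are illustrative assumptions, not part of the attached file; the real input would be streamed from disk in the same way.

```python
import csv
import io

def keep_sport(rows, sport_id="1"):
    """Yield only the rows whose SPORTS_ID equals sport_id (compared as text)."""
    for row in rows:
        if row["SPORTS_ID"].strip() == sport_id:
            yield row

# Tiny made-up sample standing in for the real CSV (values are not from the attachment).
sample = io.StringIO(
    "SPORTS_ID,EVENT_ID,EVENT\n"
    "1,100,Half Time\n"
    "7,101,Some Other Market\n"
    "1,102,Match Odds\n"
)
kept = list(keep_sport(csv.DictReader(sample)))
print(len(kept))
```

Because `csv.DictReader` reads one row at a time, the same generator should cope with a file of around 1,000,000 rows without loading it all into memory.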

Sort the file alphabetically by EVENT (column F) and save a file for each of the following distinct Events (any other Events should be ignored):

1) 1st Goal - 5812 rows including header

2) 2nd Goal - 2699 rows including header

3) 3rd Goal - 2063 rows including header

4) Correct Score - 131893 rows including header

5) Draw No Bet - 4019 rows including header

6) Half Time - 27926 rows including header (see attached)

7) Half Time Score - 48057 rows including header

8) Half Time Full Time - 17799 rows including header

9) Match Odds - 100891 rows including header

10) Next Goal - 6192 rows including header

11) Odd Or Even - 184 rows including header

12) Over One And A Half Goals - 36448 rows including header

13) Over Two And A Half Goals - 57457 rows including header

14) Over Three And A Half Goals - 31816 rows including header

15) Penalty Taken - 653 rows including header

16) Sending Off - 1999 rows including header
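One possible way to split the filtered rows into the 16 groups listed above is sketched here. It assumes the EVENT values in the file are spelled exactly as in the list, which should be checked against the attachment; `split_by_event` and the sample rows are hypothetical.

```python
from collections import defaultdict

# Event names copied from the list above; exact spelling in the file is an assumption.
WANTED_EVENTS = {
    "1st Goal", "2nd Goal", "3rd Goal", "Correct Score", "Draw No Bet",
    "Half Time", "Half Time Score", "Half Time Full Time", "Match Odds",
    "Next Goal", "Odd Or Even", "Over One And A Half Goals",
    "Over Two And A Half Goals", "Over Three And A Half Goals",
    "Penalty Taken", "Sending Off",
}

def split_by_event(rows, wanted=WANTED_EVENTS):
    """Group rows by EVENT, ignoring events that are not in `wanted`."""
    buckets = defaultdict(list)
    for row in rows:
        event = row["EVENT"].strip()
        if event in wanted:
            buckets[event].append(row)
    # Return the groups in alphabetical order of event name.
    return dict(sorted(buckets.items()))

# Made-up rows for illustration only.
rows = [
    {"EVENT": "Match Odds", "EVENT_ID": "1"},
    {"EVENT": "Half Time", "EVENT_ID": "2"},
    {"EVENT": "Unlisted Market", "EVENT_ID": "3"},
    {"EVENT": "Half Time", "EVENT_ID": "4"},
]
groups = split_by_event(rows)
print(list(groups))
```

Each group would then be written to its own output file, with the header row added once per file.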

Take the Half Time file (#6) saved above and sort it alphabetically by FULL_DESCRIPTION (column D). Extract all rows whose FULL_DESCRIPTION begins with 'English Soccer/Barclays Premier League' and save them as a new file.

The new file will include 1591 rows including header (see attached).
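A minimal sketch of the Premier League extraction follows. It assumes the prefix match is case-sensitive and applies to the very start of FULL_DESCRIPTION; the function name and sample descriptions are made up for illustration.

```python
PREFIX = "English Soccer/Barclays Premier League"

def premier_league_rows(rows, prefix=PREFIX):
    """Keep rows whose FULL_DESCRIPTION starts with prefix, sorted by FULL_DESCRIPTION."""
    matching = [r for r in rows if r["FULL_DESCRIPTION"].startswith(prefix)]
    return sorted(matching, key=lambda r: r["FULL_DESCRIPTION"])

# Made-up descriptions; the path layout after the prefix is an assumption.
rows = [
    {"FULL_DESCRIPTION": "English Soccer/Barclays Premier League/Team C v Team D"},
    {"FULL_DESCRIPTION": "Spanish Soccer/Some League/Team E v Team F"},
    {"FULL_DESCRIPTION": "English Soccer/Barclays Premier League/Team A v Team B"},
]
selected = premier_league_rows(rows)
print(len(selected))
```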

For each game, order the rows by SELECTION, LATEST_TAKEN (Oldest to Newest) and FIRST_TAKEN (Oldest to Newest).
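The ordering might be implemented along these lines. Timestamps in day/month/year form do not sort chronologically as plain text, so the sketch parses them first. The format string is inferred from values such as `30/10/2010 14:00`, and using EVENT_ID to identify a game is an assumption; the sample rows are made up.

```python
from datetime import datetime

FMT = "%d/%m/%Y %H:%M"  # inferred from timestamps such as 30/10/2010 14:00

def sort_key(row):
    """Order by game (EVENT_ID, assumed), then SELECTION, then both timestamps oldest first."""
    return (
        row["EVENT_ID"],
        row["SELECTION"],
        datetime.strptime(row["LATEST_TAKEN"], FMT),
        datetime.strptime(row["FIRST_TAKEN"], FMT),
    )

# Made-up rows: as text, "01/11/..." sorts before "31/10/...", which is chronologically wrong.
rows = [
    {"EVENT_ID": "1", "SELECTION": "Home (HT)",
     "LATEST_TAKEN": "01/11/2010 12:00", "FIRST_TAKEN": "01/11/2010 11:00"},
    {"EVENT_ID": "1", "SELECTION": "Away (HT)",
     "LATEST_TAKEN": "01/11/2010 12:30", "FIRST_TAKEN": "01/11/2010 12:10"},
    {"EVENT_ID": "1", "SELECTION": "Home (HT)",
     "LATEST_TAKEN": "31/10/2010 18:00", "FIRST_TAKEN": "31/10/2010 17:00"},
]
ordered = sorted(rows, key=sort_key)
print([(r["SELECTION"], r["LATEST_TAKEN"]) for r in ordered])
```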

For each of the 3 possibilities in a game, extract the row whose LATEST_TAKEN value is closest to the DT ACTUAL_OFF value and has an IN_PLAY value of PE. Taking Arsenal vs West Ham as an example, the extracted rows would be as follows:

| SELECTION | ODDS | NUMBER_BETS | VOLUME_MATCHED | LATEST_TAKEN | FIRST_TAKEN | WIN_FLAG | IN_PLAY |
|---|---|---|---|---|---|---|---|
| Arsenal (HT) | 1.55 | 2 | 386.56 | 30/10/2010 14:00 | 30/10/2010 14:00 | 0 | PE |
| The Draw (HT) | 3.5 | 11 | 1165.66 | 30/10/2010 13:59 | 30/10/2010 13:36 | 1 | PE |
| West Ham (HT) | 12.5 | 5 | 31.2 | 30/10/2010 13:55 | 30/10/2010 13:03 | 0 | PE |

Save a new file where we only have 3 rows as above per match.
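The "closest pre-event row" rule could be sketched as follows. Several things here are assumptions: EVENT_ID is taken to identify a game, DT ACTUAL_OFF is assumed to use the same timestamp format as LATEST_TAKEN, and on a tie the first row encountered is kept. The off time and the extra odds values in the sample are made up.

```python
from datetime import datetime

FMT = "%d/%m/%Y %H:%M"  # assumed to apply to both DT ACTUAL_OFF and LATEST_TAKEN

def closest_pre_event(rows):
    """For each (EVENT_ID, SELECTION), keep the PE row whose LATEST_TAKEN is nearest the off time."""
    best = {}
    for row in rows:
        if row["IN_PLAY"] != "PE":
            continue  # in-play (IP) and NI rows are never candidates
        off = datetime.strptime(row["DT ACTUAL_OFF"], FMT)
        taken = datetime.strptime(row["LATEST_TAKEN"], FMT)
        gap = abs(off - taken)
        key = (row["EVENT_ID"], row["SELECTION"])
        if key not in best or gap < best[key][0]:
            best[key] = (gap, row)
    return [row for _, row in best.values()]

# Made-up sample: the 14:01 off time and the 1.6 / 1.5 odds are illustrative only.
rows = [
    {"EVENT_ID": "1", "SELECTION": "Arsenal (HT)", "ODDS": "1.6",
     "DT ACTUAL_OFF": "30/10/2010 14:01", "LATEST_TAKEN": "30/10/2010 13:40", "IN_PLAY": "PE"},
    {"EVENT_ID": "1", "SELECTION": "Arsenal (HT)", "ODDS": "1.55",
     "DT ACTUAL_OFF": "30/10/2010 14:01", "LATEST_TAKEN": "30/10/2010 14:00", "IN_PLAY": "PE"},
    {"EVENT_ID": "1", "SELECTION": "Arsenal (HT)", "ODDS": "1.5",
     "DT ACTUAL_OFF": "30/10/2010 14:01", "LATEST_TAKEN": "30/10/2010 14:01", "IN_PLAY": "IP"},
]
picked = closest_pre_event(rows)
print(picked[0]["ODDS"])
```

Applied to a full game with three selections, the function returns one row per selection, which gives the 3 rows per match described above.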

For each of the 3 possibilities in a game, extract the row whose ODDS value is less than 2.01. This should return zero or one row per game. In the current example, the row where the ODDS value is 1.55 will be extracted. Save a new file where we have no more than 1 row as above per match. See attached for how files should be refined.
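The odds filter is simple enough to show in a few lines. This sketch reuses the three example rows from the Arsenal vs West Ham game; the function name `short_priced` is hypothetical, and ODDS is assumed to arrive as text that `float` can parse.

```python
def short_priced(rows, limit=2.01):
    """Keep rows whose ODDS value is strictly below limit."""
    return [r for r in rows if float(r["ODDS"]) < limit]

# The three extracted rows from the example game (only the columns needed here).
game = [
    {"SELECTION": "Arsenal (HT)", "ODDS": "1.55"},
    {"SELECTION": "The Draw (HT)", "ODDS": "3.5"},
    {"SELECTION": "West Ham (HT)", "ODDS": "12.5"},
]
result = short_priced(game)
print([r["SELECTION"] for r in result])
```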

There may be further work required on some other files saved above if this project proves successful.

Engineering Microsoft Project Management Script Install Shell Script Software Architecture Software Testing Windows Desktop

Project ID: #3854575

About the project

29 proposals Remote project Active Nov 25, 2010

29 freelancers are bidding on average $159 for this job
