Project Description:
Types to Groups Program
The purpose of this program is to aggregate the frequencies of individual types into group frequencies. The program reads two files (Groups.txt and Types.txt) and writes a third (Output.txt). Types.txt contains the frequencies of individual words, lemmas, or polywords. Each line of Groups.txt contains a headword followed by all the inflected and derived forms whose frequencies need to be summed together. Output.txt contains each headword and the summed frequency of the headword and its listed forms. Note that the data is case sensitive. The data can be separated with commas or tabs for easy import into Excel.
For example, Groups.txt (or .csv) will look like this:
a,an,A,An,A,AN
about,abouts,About,Abouts,ABOUT,ABOUTS
ad,ads
afraid
age,ages,aged,aging,ageing,ageings
all
already
always
any
anything,anythings
as,As,AS
at,At,AT
at least
attack,attacks,attacked,attacking,attacking
Types.txt (or .csv) will look like this:
a 9
about 10
ads 1
afraid 1
age 1
All 1
all 1
already 2
always 1
an 1
any 1
anything 2
Anything 1
at least 2
Output.txt (or .csv) will look like this:
a 10
about 10
ad 1
afraid 1
age 1
all 2
already 2
always 1
any 1
anything 3
at least 2
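The aggregation described above can be sketched in Python as follows. This is only an illustrative sketch, not the program's actual source: the function names (parse_groups, parse_types, aggregate) are hypothetical, fields are assumed to be separated by commas or tabs, matching is exact (case sensitive), so a form in Types.txt is counted only if it is spelled identically on a Groups.txt line, and headwords with a total of zero are left out of the output, as the sample Output.txt suggests.

```python
import re

def parse_groups(lines):
    """Return (headwords in file order, mapping of each listed form to its headword)."""
    heads = []
    form_to_head = {}
    for line in lines:
        # Fields are assumed to be comma- or tab-separated; spaces stay inside a form.
        forms = [f.strip() for f in re.split(r"[,\t]", line) if f.strip()]
        if not forms:
            continue
        head = forms[0]
        if head not in heads:
            heads.append(head)
        for form in forms:
            # A form repeated on a line (or across lines) is registered once only.
            form_to_head.setdefault(form, head)
    return heads, form_to_head

def parse_types(lines):
    """Return a mapping of each type to its frequency (repeated types are summed)."""
    counts = {}
    for line in lines:
        # The last comma or tab on the line separates the type from its count.
        match = re.match(r"^(.*)[,\t]\s*(\d+)\s*$", line)
        if match is None:
            continue  # skip blank or malformed lines
        form = match.group(1).strip()
        counts[form] = counts.get(form, 0) + int(match.group(2))
    return counts

def aggregate(group_lines, type_lines):
    """Sum type frequencies per headword; headwords with no matches are omitted."""
    heads, form_to_head = parse_groups(group_lines)
    totals = {head: 0 for head in heads}
    for form, freq in parse_types(type_lines).items():
        head = form_to_head.get(form)  # exact, case-sensitive lookup
        if head is not None:
            totals[head] += freq
    return [(head, total) for head, total in totals.items() if total > 0]

if __name__ == "__main__":
    groups = ["a,an,A,An", "ad,ads", "at least", "as,As"]
    types = ["a\t9", "an\t1", "ads\t1", "at least\t2", "All\t1"]
    for head, total in aggregate(groups, types):
        print(f"{head}\t{total}")
```

In this small run, "All" is ignored because no group line lists it, and "as" is dropped because none of its forms occur in the types list. To work on the real files, the lines of Groups.txt and Types.txt could be read with open(...).read().splitlines() and the resulting pairs written to Output.txt one per line.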