All data was entered by hand into an Excel file, starting with the year 1843 and then proceeding chronologically. In building the database, the project included as many categories as possible, basing many on the 1843 dataset, since this was by far the most expansive. The database contains the following data categories: name of factory, year of data, state, location, owner, factory type, spindles operating, spindles under construction, spindles inactive, total spindles, looms, hours of work, weekly wage, weekly weight of cotton (quintales), purchase price of cotton (Spanish dollars), weekly mantas produced (pieces), power, and name changed. Further information on each data category can be found in the data fields section of this website.
As indicated by the data sources page, the sources did not all contain the same data categories, nor were they organized in the same manner. The coding of data began with the 1843 set. Where no data existed, cells were left completely blank. Two new categories, spindles inactive and name changed, were introduced with the 1844 and 1845 data, since those sources listed spindles inactive instead of the spindles under construction listed in the 1843 dataset. A total spindles category was also created, summing spindles inactive, operating, and under construction, for use in comparisons across years. Once data was coded for 1843, data for 1844, 1845, and 1857 was entered, but slight misalignments existed in the names of factories, owners, and locations. These inconsistencies needed to be resolved to make the dataset understandable for scholars comparing data across years. Since the 1843 data was the most comprehensive and descriptive, small variations in the 1844, 1845, and 1857 data, such as different spellings of names, were made to fit the 1843 data. These adjustments were very slight, changing at most a few letters or the accent of a name. For example, a factory called “Molino de Teja” in Puebla in the 1857 data was changed to align with the same factory as named in the 1843 data, “Molino de La Teja.” All changes in the data are noted in the “name changed” category, so that researchers can return to the sources and see any inconsistencies. Names that differed by more than a few letters were kept as in the original source, and changes were made only where the factory name, location, and ownership all aligned. In addition, any reference to “Don,” “Señor,” or “D.” was eliminated from the names of factory owners to make the dataset more legible.
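For readers who want to apply the same conventions to their own transcriptions, the rules above can be sketched in Python. This is an illustration only: the project's coding was done by hand in Excel, and every function name here is hypothetical. The sketch assumes that blank cells are represented as `None`, that a blank spindle count contributes nothing to the total when at least one count is present, and that a similarity cutoff of 0.85 is a reasonable stand-in for "a few letters or an accent." It only proposes a candidate spelling; it does not check that location and ownership also agree, which the project required before changing a name.

```python
import difflib
import unicodedata

# Titles the text says were removed from owner names (compared in lowercase).
HONORIFICS = {"don", "señor", "d."}


def strip_honorifics(owner):
    """Drop the tokens 'Don', 'Señor', and 'D.' wherever they appear."""
    kept = [tok for tok in owner.split() if tok.lower() not in HONORIFICS]
    return " ".join(kept)


def total_spindles(operating=None, under_construction=None, inactive=None):
    """Sum the three spindle counts; stay blank (None) if all are blank."""
    present = [v for v in (operating, under_construction, inactive)
               if v is not None]
    return sum(present) if present else None


def _fold(text):
    """Remove accents and case so spelling variants compare as equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed
                   if not unicodedata.combining(c)).casefold()


def align_name(name, reference_names, cutoff=0.85):
    """Return (name_to_use, name_changed) against the 1843 spellings.

    The cutoff is an assumption of this sketch, not a value from the
    project. Names with no close match are returned unchanged.
    """
    folded = {_fold(ref): ref for ref in reference_names}
    hits = difflib.get_close_matches(_fold(name), list(folded),
                                     n=1, cutoff=cutoff)
    if not hits:
        return name, False
    aligned = folded[hits[0]]
    return aligned, aligned != name


# "Fábrica Ejemplo" is a made-up entry used only to show accent handling.
names_1843 = ["Molino de La Teja", "Fábrica Ejemplo"]
print(align_name("Molino de Teja", names_1843))
print(strip_honorifics("Don Fulano de Tal"))
print(total_spindles(operating=1000, inactive=200))
```

A match proposed this way would still need to be confirmed against location and ownership by hand, and the second element of the returned pair corresponds to the “name changed” category.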
One ongoing issue with the data is in the “factory type” category, which combines the type of factory from the 1843-45 data with that from the 1857 data. In 1843-45, factories are listed as either spinning, weaving, or both. In 1857, factories are listed as either cotton or wool. More research and comparison among the data are being undertaken to clarify whether this category can be made more specific or should be eliminated from the database.
Random checks were also carried out to ensure that the information entered matched the sources. Once data entry was completed, the Excel file was uploaded to the database using WPtable, a WordPress plugin.
For the 1850-1854 Excel file, all data was entered exactly as listed in the sources, and no changes were made to numbers or the spelling of names.