Dwayne Winseck October 30, 2012
At the CMCR Project we open our data sets to anyone who wants to use them, for free and without restrictions, because there are more insights buried in them than we will ever discover on our own. We think ‘real world’ events such as the current efforts by Bell to consummate its takeover of Astral Media, despite the CRTC’s recent decision to stop the deal dead in its tracks, also demand that such data be made as widely available as possible.
We also want you to help us discover any errors that might exist in our data sets. All bugs are shallow, as coders like to say, when a thousand pairs of eyes are looking for them. When mistakes are discovered, we will acknowledge them, fix them, thank you, and move on.
All of this said, however, our aim is to minimize the occurrence of mistakes. We are striving to develop the best practices possible in terms of data management, internal review of data and graphics before they go to post, and so forth. As I have said many times, building a systematic, long-term and comprehensive data set covering over a dozen TMI sectors from 1984 to the present is not easy. Managing all of the data that emerges from this exercise is not easy, either.
I have carried much of that responsibility on my own up until now. Beginning with our next data release, however, two graduate student members of the CMCR Project team – Adeel Khamisa and Lianrui Jia – will be reviewing all new data releases, and whatever charts, tables, graphs and figures go along with them, before we post them. We will also be tightening up our protocols for managing our data. Adeel will take the lead on data administration, while Lianrui’s keen eye for details and familiarity with the processes of data collection for this project put her in a good position to review presentations of our data before they go out the door.
Adeel and Lianrui have already proven their skills in such things and an ability to work under incredible pressure and tight timelines, and I’m happy to rely on them even more than I have. Of course, the final responsibility for the project rests with me.
Finally, we have always said that this project is driven by an ‘open source/open data’ mentality, but we also want to open up each new release of data to an external review period of two weeks. Of course, you can review and comment on our stuff any time. But in terms of a better-defined external review period, we will notify interested parties whenever a new data set is open for review. Please let us know if you would like to be on the list when we make such announcements.
Ultimately, having our data scrutinized and verified by multiple sources should only improve the results further. Long after the spotlight on Bell’s ongoing efforts to take over Astral Media dims, we will have returned to toil in relative academic obscurity, methodically, slowly, steadily building a long-term body of evidence for others to draw on whenever the need arises.
This is a long-term project that is formally slated to run until 2016, and which will likely outlive even that date. It is essential that we take steps to improve our data whenever doing so is merited, and that we do so sooner rather than later. This has been our commitment from day one of the CMCR Project; it is what we are doing today; it is what we will continue to do for as long as the project exists.