Opinion on public data

I find it good practice to share your research data along with the publication, but many things may prevent you from doing so: self-interest, self-doubt, or government data that require signed forms before release. I bet self-interest is the number one reason people are reluctant to publish their research data, as a way to prevent others from “stealing” future research opportunities from them. I find it rather foolish to think this way, especially in this day and age.

Data are everywhere. No good research is prevented by a lack of data, with many exceptions of course. Your data are just one manifestation of a general underlying process, which may be documented by someone other than you. The value of your data diminishes the moment you hold tight to them. Holding on to the data only tells others two things: 1) your inability to produce further work from them, and 2) your instinct that further work is still possible.

I try not to let my research be tied down to any particular data set. Any data should be fine, as long as they document the same process and illustrate the same idea. When a data set becomes or remains private, my usual response is “Good, I am not interested.” Non-public data are full of errors and obscurity, and suffer from missing or careless documentation. The last time I used a non-public data set, 90% of my time and brain power was wasted on cleaning up the ****. Since it is non-public and I can’t upload a corrected version, countless more hours will be wasted in the future. That tells you how I feel about non-public data. If I don’t have to look at them, I won’t. Generally speaking, the quality of public data sets is much better, and I recommend weaning yourself off non-public data sets as early as possible.

I am sharing the portion of data that I am allowed to share. I have to admit that many of the data sets I have worked on are definitely not public. Many of them I don’t own, but those I do own, you can find here. They are hosted on figshare, which gives me, for free, a unique identifier for each data set I publish there. In other cases, I am serving as a custodian of some data sets for others. Data sets without a download link require a formal email request to me; I will forward the request to the owner and send you the data.

Just a friendly reminder:

Each day, 80k acres of forests are disappearing ...

So think about that the next time you are about to print something.
      
        - - -
        -        -  -     --    -
     -                 -         -  -
                     -
                    -                --
    -          -            -              -
    -            '-,        -               -
    -              'b      *
     -              '$    #-                --
    -    -           $:   #:               -
  --      -  --      *#  @):        -   - -
               -     :@,@):   ,-**:'   -
   -      -,         :@@*: --**'      -   -
            '#o-    -:(@'-@*"'  -
    -  -       'bq,--:,@@*'   ,*      -  -
               ,p$q8,:@)'  -p*'      -
        -     '  - '@@Pp@@*'    -  -
         -  - --    Y7'.'     -  -
                   :@):.
                  .:@:'.
                .::(@:.      -Sam Blumenstein-