Thoughts on Building a Lifetime Personal Data Collection App

Challenges and Goals

Creating an application designed to be a central repository to store and display our life’s collection of personal data is a monumental task. Having to integrate external and cloud-based data sources, which will most likely change over time, alongside our local data makes it even more complex. Ensuring that the application and our data will persist beyond our lifetime and into the foreseeable future is harder still to imagine.

I’ve seen several digital legacy companies emerge that provide ways to preserve some of the memories of people who have died. These companies are usually limited in scope to photos and a few other data types, but I’m glad to see them bring attention to this need and address it. Entrusting our personal data to a single company requires immense trust, along with the hope that the company will be around for a long time. Eventually, future custodians of our data will need to maintain it for us and move it to other locations as necessary. I believe the best way to create such a monumental application is for it to begin as an open source project.

How Do We Do It?

An application built to house our life’s personal data, both stored locally and in the cloud, needs to be extremely flexible. It should be customizable with individual components to accommodate the multitude of data types and the ways a person would want to explore them. I believe that building a single comprehensive software package for creating a personal data legacy would be quite difficult, because everyone’s needs will be different. This type of software should have an architecture very similar to an open source content management system such as WordPress or Drupal: some core functionality, supplemented by plug-ins or modules for specific features.

One example would be the integration of photographs. This would require one set of custom plugins designed to integrate both local data and cloud-based services (such as Google, Flickr, Instagram, etc.). A separate set of plugins could then provide the display layer for that data (such as galleries) as well as added functionality such as search or location-based features. Plugins would be available in a similar way for other data types such as documents, emails, video, audio files, etc.

Then, in much the same way that Automattic is a professional services company for the WordPress open source project, and Acquia is the professional services company for the Drupal open source project, an ecosystem of for-profit companies could be built around this application. That would mitigate concerns about a single company owning such a critical application, and it would help ensure the application’s future support and evolution.
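To make the photo example a little more concrete, here is a minimal sketch of how a small core could accept source plugins and display plugins. This is only an illustration under my own assumptions, not how WordPress, Drupal, or any existing project works: the names Item, Core, register_source, register_view, local_photos, cloud_photos, and gallery are all hypothetical, and the two source plugins return hard-coded sample data where a real plugin would read a folder or call a service’s API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Item:
    """One piece of personal data in a common shape (hypothetical schema)."""
    kind: str      # e.g. "photo", "document", "email"
    source: str    # name of the source plugin that produced it
    title: str
    date: str      # ISO 8601 date, so plain string sorting is chronological


class Core:
    """Hypothetical core: it only knows how to register and call plugins."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], Iterable[Item]]] = {}
        self._views: Dict[str, Callable[[List[Item]], str]] = {}

    def register_source(self, name: str, fetch: Callable[[], Iterable[Item]]) -> None:
        self._sources[name] = fetch

    def register_view(self, name: str, render: Callable[[List[Item]], str]) -> None:
        self._views[name] = render

    def collect(self, kind: str) -> List[Item]:
        # Ask every source plugin for its items, keep one kind, sort by date.
        items = [
            item
            for fetch in self._sources.values()
            for item in fetch()
            if item.kind == kind
        ]
        return sorted(items, key=lambda item: item.date)

    def render(self, view: str, kind: str) -> str:
        return self._views[view](self.collect(kind))


# Source plugins (sample data stands in for real folder scans or API calls).
def local_photos() -> List[Item]:
    return [Item("photo", "local", "Beach", "2014-07-04")]


def cloud_photos() -> List[Item]:
    return [Item("photo", "cloud", "Hike", "2013-05-12")]


# Display plugin: a very plain text "gallery".
def gallery(items: List[Item]) -> str:
    return "\n".join(f"{i.date}  {i.title}  ({i.source})" for i in items)


core = Core()
core.register_source("local", local_photos)
core.register_source("cloud", cloud_photos)
core.register_view("gallery", gallery)
print(core.render("gallery", "photo"))
```

The point of the split is that the core never needs to know where a photo came from or how it will be shown, so a source plugin can be replaced when a cloud service changes or disappears without touching the display plugins.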

This Software Was Almost Realized

A few years ago it appeared that we were going to have such an application with the arrival of The Locker Project. I was monitoring this open source project in its early days and it looked very promising. One of the first areas they tackled was creating data connectors to both import data into the application and publish data from it. Developing these connectors proved to be fairly time consuming, because each one had to communicate differently with so many web services and their APIs. This part of the project was being developed by a for-profit company called Singly, which was also the corporate support partner for The Locker Project. The Locker Project seemed to be making great strides, helped considerably by Singly’s services. Singly was also an attractive solution for other companies, since relying on it for API connections to so many third-party services greatly reduced their development time and costs.

Unfortunately, with the success of Singly, much of the momentum of The Locker Project seemed to fade. Eventually Singly was acquired by another company and development of The Locker Project stopped. In fact, in writing this post I discovered that the domain apparently wasn’t renewed and has now been taken over by a squatter. Here’s the last snapshot of the site I was able to find on Archive.org.

I had interacted a few times with developers from The Locker Project in the hope of it being revived, but nothing materialized. Here’s the last interaction I had with them:

[Image: locker_project_twitter]
More conversations are available here.

Homebrew Method

I’ve spent a considerable amount of time coming up with processes and cobbling together various pieces of software to try to create a comprehensive way to collect, back up, and display all of my personal digital data. I’ve created a guide with three crucial steps that gives others a roadmap as well. I keep searching for ways to improve this process, along with the ways I catalog and review my data. My hope is to eventually have a single centralized application that manages most of this for me, lets me reminisce over my own memories, and gives family and future generations an easy way to view them long after I’m gone.

Moving Forward

I’d love to see a stronger focus on, and more discussion of, personal data storage, preservation, and analytics. I feel that most people aren’t thinking about how they will pass along their personal data, which continues to grow in detail and size as so many aspects of our lives go digital. I came up with a base plan that is useful for getting started, and this is something we should all be thinking about. If you have written or are aware of anything related to this topic, in the way of software or methods to achieve some of what I’ve discussed, please share it in the comments.

This Post Has 10 Comments

  1. Thank you for your post; few people spend time or other resources on this topic. I guess one must be a long-term thinker to do this. For myself, I focus on the ‘collection’ side of this instead of the ‘app’ side. Long-term continuity is hard to achieve with an app that requires maintenance and development. Therefore I store everything as *.txt, *.pdf or *.jpeg, and my collection is just folder/file based.
    I do spend a fair amount of time on getting file timestamps right, so I can display my files chronologically.

  2. Thanks for the feedback Joel. Like you, I organize my local data into a “collection”, but trying to find or discover information by manually navigating a folder structure isn’t ideal. My data collection also includes imports from external apps and services that contain important data I want incorporated as well, such as my location history from Google.

  3. Hello Mark
    Thank you very much for this post! It’s a brilliant summary of this interesting topic.
    After many years of reading your blog posts, I have become more and more focused on digital preservation and digital legacy. I have tried to develop a personal project in PHP inspired by The Locker Project.
    Today, this application automatically backs up everything I have created or favorited on digital platforms, social networks, etc.
    Here is the list:
    YOUTUBE
    VIMEO
    DAILYMOTION
    INA
    SOUNDCLOUD
    MIXCLOUD
    FLICKR
    INSTAGRAM
    PINTEREST
    DELICIOUS
    SCOOPIT
    INSTAPAPER
    FOURSQUARE
    SLIDESHARE
    FACEBOOK
    TWITTER
    RUNKEEPER
    NARRATIVE
    UBER
    WITHINGS
    SPOTIFY

    Through the APIs of all these services, I back up and download everything possible. For instance, with video platforms like YouTube or Vimeo, I automatically download all media files. The same goes for photos on Flickr and Instagram. For links, my application makes a screen capture of the page. On Runkeeper, I keep all the data about my running sessions. With Withings, I back up all my activities: steps, sleep, weight, etc.

    If I tweet, post a status on Facebook, favorite a SlideShare, a song on Spotify or SoundCloud, or pin an image on Pinterest, everything is automatically backed up.

    It’s like Gyroscope in terms of saving data. I’m really impressed by that service, but I’m not comfortable with another application “owning” my personal data.

    I have backed up more than 10,000 different items (representing 300 GB of assets), and the next step is to easily “visualize” all this data.

    I totally agree with you regarding the open source approach for this kind of service. I also think this kind of application should run on an autonomous platform at home, like a Raspberry Pi. So I’m thinking about open-sourcing my code, but I don’t know how to do it: I’m not a “professional” developer and I’m pretty sure that my PHP code sucks 😉
    Maybe the solution would be to develop it on a platform like WordPress or Drupal… I don’t know.

    This application is also used to back up all of my family’s important media.

    Thanks again for your posts, it’s a great source of inspiration !! Don’t stop. 🙂

    Renaud (@renalid)

  4. Thanks for being a reader Renaud. If you want to consider submitting your software as an open source project you can learn more about the process here: https://opensource.com/resources Maybe you can write a blog post that discusses how your software works too which others might find helpful.

  5. Hi Mark, I recently saw your blog post about SmallWorlds and its lifestreaming features. Do you still play the game? It is super fun, and there are so many cool new updates in the game now. I joined in April 2012.

  6. Soooo is the point of this article that we need a system that enables us to create a backup of our public/private digital life, and/or is the point to have analytics on top of it? If so, how would you rank the order of priority for backing up said data? Health / Social / Pictures? Would you then want to put it into a private network, something like Path?
    Hugs G

  7. Hi Geoffrey! I don’t think there is a single “point” with regard to the need for us to back up and archive our data. While I think everyone should do it, how that data will be accessed and used could be different for everyone. I haven’t thought about ranking each type of data; I back up everything anyway. I think it would be great to create a flexible, permissions-based model for accessing the data that defines what’s accessible based on roles. For example, public, friends, and family would each have different access levels.

  8. Hi Mark,
    Long time follower and fan! I get the permission-based part. I think the more important question, since there are so many data points we can create each day, is which would be the most important: Facebook data, Pinterest posts, Twitter, Fitbit, tracking data? Let’s say the top 10? I never realized how handsome a man you are.
