Self tracking your browser history!
 
Pascal van Kooten 8fbda36c40
Merge pull request #12 from Ryuno-Ki/options
2 years ago

nostalgia_chrome

Cross-platform Chrome History Analysis

Self tracking

There is a growing self-tracking movement: people monitor their pulse, heart rate and so on. Yet the most important data is not being tracked: our online behavior.

To document our own online behavior, we need the following:

  1. Chrome keeps its history for at most 90 days, so we need to start saving it ourselves.

  2. We need to collect HTML data from the pages we visit.

  3. We need to extract and analyze data from the HTML, such as code snippets, links, microdata, images and events: anything, really. This is done in Nostalgia Core.

  4. We need to allow plugins and make them configurable (contributions are welcome). The first example plugin additionally tracks which videos you watch.
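
Step 3 is handled by Nostalgia Core, whose implementation is not shown here. Purely as an illustration of what extracting data from saved HTML can look like, the following minimal sketch collects link targets using only Python's standard library. The names `LinkCollector` and `extract_links` are hypothetical and not part of nostalgia_chrome or Nostalgia Core.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

The same pattern (subclass `HTMLParser`, react to the tags of interest) would extend to images or other elements.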

What can you expect (Data overview)

In ~/nostalgia_data/meta.jsonl an index will be saved per visit:

{
  "path":"/home/pascal/nostalgia_data/html/1576317113.7_httpsgithubcomnostalgiadevnostalgia_chrome.html.gz",
  "url": "https://github.com/nostalgia-dev/nostalgia_chrome",
  "time":"1576317113.75019"
}

In ~/nostalgia_data/html the source HTML will be stored as .html.gz (reaching about 8x compression).
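
To show how these two pieces fit together, here is a minimal sketch that reads the index and loads the stored HTML for a visit. It assumes one JSON object per line in `meta.jsonl` (as the `.jsonl` extension suggests) and that the stored pages decode as UTF-8; the function names `iter_visits` and `read_html` are hypothetical helpers, not part of the package.

```python
import gzip
import json


def iter_visits(meta_path):
    """Yield one dict per visit from a meta.jsonl-style index file."""
    with open(meta_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def read_html(visit):
    """Return the decompressed HTML that an index entry points to."""
    # Assumption: pages are UTF-8; undecodable bytes are replaced, not fatal.
    with gzip.open(visit["path"], "rt", encoding="utf-8", errors="replace") as f:
        return f.read()
```

For example, iterating over `iter_visits(os.path.expanduser("~/nostalgia_data/meta.jsonl"))` and calling `read_html` on each entry would walk the whole archive. Note that `time` is stored as a string in the example entry, so convert it with `float(visit["time"])` before doing arithmetic on it.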

In ~/nostalgia_data/videos_watched.jsonl the data for events on HTML5 video elements will be stored (written when playback stops or the tab is closed):

{
  "playingSince": 1576273573.08,
  "seekTime": 0,
  "playingUntil": 1576273599.977,
  "duration": 26.8970000744,
  "totalClipDuration": 3510.301,
  "pageLoadTime": 1576266470.316,
  "loc": "https://www.youtube.com/watch?v=Zz-bhLjVS5o",
  "title": "Lost Frequencies | Tomorrowland Mainstage 2019 (Full Set) - YouTube",
  "likes": 24137,
  "dislikes": 946
}
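
As one possible use of this file, the sketch below totals the watched time per video. It assumes one JSON object per line and that `duration` is the watched time in seconds (in the example above it equals `playingUntil` minus `playingSince`); `watch_time_per_video` is a hypothetical helper, not part of the package.

```python
import json
from collections import defaultdict


def watch_time_per_video(lines):
    """Sum watched seconds per video URL from videos_watched.jsonl-style lines."""
    totals = defaultdict(float)
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        event = json.loads(line)
        # Assumption: "duration" holds the watched time in seconds.
        totals[event["loc"]] += event["duration"]
    return dict(totals)
```

Passing an open file handle for `~/nostalgia_data/videos_watched.jsonl` works, since iterating over a file yields its lines.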

Installation

  1. Clone this repository: git clone git@github.com:nostalgia-dev/nostalgia_chrome.git

  2. In Chrome, open the menu, click "More tools" and then "Extensions". Click "Load unpacked" (enable "Developer mode" on that page if the button is not visible), navigate to the chromePlugin folder and click "Open".

  3. pip install nostalgia_chrome

  4. To test it out, run nostalgia_chrome run_server. This will run the web server in the foreground so you can see that it works.

  5. Visit a URL (not a file:// or localhost address) to verify that it works. The data will be stored in ~/nostalgia_data/meta.jsonl and ~/nostalgia_data/html.

  6. To make sure nostalgia_chrome gets automatically run on boot:

On Linux (systemctl based):

pip install sysdm
sysdm create "nostalgia_chrome run_server" --extensions ""

On Windows: awaiting a contribution on how to do this, see https://github.com/nostalgia-dev/nostalgia_chrome/issues/2

On OSX: awaiting a contribution on how to do this, see https://github.com/nostalgia-dev/nostalgia_chrome/issues/1