foote.pub

Jonathan Foote, Security Dad


git Line Level History Browser


git blame annotates each line of a file with the commit that last changed it, along with the author and timestamp.

$ git blame trollbox.py | head -n 3
496f919d (Jonathan Foote 2014-11-28 11:12:01 -0500   1) #!/usr/bin/env python
496f919d (Jonathan Foote 2014-11-28 11:12:01 -0500   2) 
496f919d (Jonathan Foote 2014-11-28 11:12:01 -0500   3) import sys

git blame only shows the most recent change to each line. git log -L traces the full history of a line range. It was initially developed as a 2010 Google Summer of Code project (ref):

Generally, the goal of this project is to:
1. 'git log -L' to trace multiple ranges from multiple files;
2. move/copy detect when we reach the end of some lines(where lines
are added from scratch).

And now, we have supports in detail:
1. 'git log -L' can trace multiple ranges from multiple files;
2. we support the same syntax with 'git blame' '-L' options;
3. we integrate the 'git log -L' with '--graph' options with
parent-rewriting to make the history looks better and clear;
4. move/copy detect is in its half way. We get a nearly workable
version of it, and now it is in a phrase of refactor, so in the scope
of GSoC, move/copy detect only partly complete.

Since then the logic has reached the mainline and is now available in recent versions of git. From git log --help:

-L <start>,<end>:<file>, -L :<regex>:<file>
    Trace the evolution of the line range given by "<start>,<end>" (or the funcname regex
    <regex>) within the <file>. You may not give any pathspec limiters. This is currently
    limited to a walk starting from a single revision, i.e., you may only give zero or one
    positive revision arguments. You can specify this option more than once.

    <start> and <end> can take one of these forms:

    o   number

        If <start> or <end> is a number, it specifies an absolute line number (lines count
        from 1).

    o   /regex/

        This form will use the first line matching the given POSIX regex. If <start> is a
        regex, it will search from the end of the previous -L range, if any, otherwise from
        the start of file. If <start> is "^/regex/", it will search from the start of file.
        If <end> is a regex, it will search starting at the line given by <start>.

    o   +offset or -offset

        This is only valid for <end> and will specify a number of lines before or after the
        line given by <start>.

    If ":<regex>" is given in place of <start> and <end>, it denotes the range from the
    first funcname line that matches <regex>, up to the next funcname line. ":<regex>"
    searches from the end of the previous -L range, if any, otherwise from the start of
    file. "^:<regex>" searches from the start of file.
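To make the two spec shapes from the help text concrete, here is a minimal Python sketch that builds the argument list for either form. The helper names (line_log_command, run_line_log) are hypothetical and exist only for illustration; the only git behavior assumed is the -L syntax quoted above.

```python
import subprocess


def line_log_command(path, start=None, end=None, funcname=None):
    """Build a `git log -L` argument list (hypothetical helper).

    Either pass funcname (the ":<regex>:<file>" form) or both start and
    end (the "<start>,<end>:<file>" form). start and end are inserted
    verbatim, so they may be numbers, "/regex/", or "+offset" strings.
    """
    if funcname is not None:
        spec = f":{funcname}:{path}"
    elif start is not None and end is not None:
        spec = f"{start},{end}:{path}"
    else:
        raise ValueError("give either funcname or both start and end")
    return ["git", "log", "-L", spec]


def run_line_log(repo_dir, **kwargs):
    """Run the command inside repo_dir and return git's output as text."""
    cmd = line_log_command(**kwargs)
    result = subprocess.run(
        cmd, cwd=repo_dir, capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, line_log_command("trollbox.py", start=1, end=3) yields the arguments for tracing the first three lines of the file, and line_log_command("trollbox.py", funcname="class") yields the function-name form.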

The function-level tracking feature seems pretty cool. It could be my lack of experience with regular expressions outside of PCRE and Python, but it seems like this feature applies some one-off logic before handing the input string to the regex parser. I googled to find the test files for -L and goofed around a bit. It looks like only lines in the target file that start with a non-whitespace character are considered as candidate function lines.

This works:

$ git log -L :class:trollbox.py
commit f339b3ce300280042d03d74be47097c470219179
[...snip...]
@@ -13,151 +13,154 @@
 class MainWindow(QMainWindow):
[...snip...]

And this doesn’t:

$ git log -L ':    def download:trollbox.py'
fatal: -L parameter '    def download' starting at line 1: no match

Examples are from trollbox.

John Firebaugh wrote a good summary of these and other techniques in 2012. Some of the features no longer exist, but the article remains a good roundup. There is a useful tip on using vim-fugitive’s :Gblame for manual analysis in the comments section.

Update 2014-01-04

The logic that determines whether or not a line is a function appears to reside here: line-range.c line 128
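As a rough illustration of that check, here is a minimal Python sketch of the default behavior as I understand it when no language-specific diff driver is configured for the file: a line only counts as a function line if its first character is a letter, an underscore, or a dollar sign. The function names are hypothetical, Python's re module stands in for git's POSIX regex matching, and the search goes line by line, so treat this as a model of the behavior and not a port of line-range.c.

```python
import re


def looks_like_funcname_line(line):
    """Assumed default check: first character is an ASCII letter, '_' or '$'."""
    if not line:
        return False
    first = line[0]
    return (first.isascii() and first.isalpha()) or first in "_$"


def find_funcname_range(lines, pattern):
    """Return a 1-based inclusive (start, end) range, or None if nothing matches.

    start is the first line that matches pattern AND passes the check above;
    end is the last line before the next line that passes the check.
    """
    regex = re.compile(pattern)  # Python regex, an approximation of POSIX
    start = None
    for index, line in enumerate(lines):
        if regex.search(line) and looks_like_funcname_line(line):
            start = index
            break
    if start is None:
        return None
    end = len(lines)
    for index in range(start + 1, len(lines)):
        if looks_like_funcname_line(lines[index]):
            end = index
            break
    return (start + 1, end)


# Made-up sample file, loosely shaped like the trollbox.py examples.
SAMPLE = [
    "import sys",
    "",
    "class MainWindow(QMainWindow):",
    "    def download(self):",
    "        pass",
    "",
    "def main():",
    "    pass",
]

if __name__ == "__main__":
    print(find_funcname_range(SAMPLE, "class"))
    print(find_funcname_range(SAMPLE, "    def download"))
```

Under this model the pattern class finds the top-level class line, while '    def download' finds nothing because the only line it matches begins with whitespace, which lines up with the "no match" error above.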