Saturday, February 20, 2016

How to Search with Git



Git offers several powerful options to search for commits or changes.

Find Files by Content Pattern: Git Grep

To search for a specific pattern in repository files, you can use the Git grep command.

git grep pattern


The pattern can be a regular expression.

By default, Git grep searches tracked files in the working tree, starting from the current directory, but you can pass a branch name or a directory to search instead. Git grep has two advantages over the shell's grep command. First, it searches only tracked files, so you get more relevant results. Second, the shell's grep command requires the files you want to search to be checked out in the working tree. Git grep, on the other hand, can search any branch the repository knows about, whether or not it is checked out and whether it is a local branch or a remote-tracking branch that has already been fetched.

git grep pattern remote/branch

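Here we search for a pattern in a remote-tracking branch that has no local branch of its own. The branch must already have been fetched, because Git grep reads the local object database and does not contact the remote.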
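As a rough illustration of the two points above, the sketch below builds a throwaway repository and compares a working-tree search with a search of another branch. The file names, the branch name feature, and the user identity are all made up for the example.

```shell
# Throwaway repository so the commands are safe to run anywhere.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

# One tracked file on the starting branch.
printf 'def connect():\n    pass\n' > app.py
git add app.py
git commit -q -m "Add connect"

# A second branch whose version of the file contains a TODO marker.
git checkout -q -b feature
printf 'def connect():\n    pass  # TODO retry\n' > app.py
git commit -q -am "Note missing retry"
git checkout -q -                          # back to the starting branch

# An untracked file: git grep skips it, a plain recursive grep would not.
echo 'TODO untracked' > scratch.txt

# Working-tree search with line numbers (-n). Nothing tracked matches here,
# and git grep exits with status 1 when there are no matches.
worktree_hits=$(git grep -n 'TODO' || true)

# Search the tip of the feature branch without checking it out,
# limited to Python files by the pathspec after "--".
branch_hits=$(git grep -n 'TODO' feature -- '*.py')

echo "working tree: ${worktree_hits:-<none>}"
echo "feature:      $branch_hits"
```

Matches from a branch are prefixed with the branch name, so it stays clear which revision each hit came from.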


Find Commits by Message Pattern: Git Log Grep

To find commits by a pattern in the commit message rather than in the diff, use the --grep option of the git log command.

git log --grep='Fixing bug 103'
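A small sketch of message search in a scratch repository follows. The commit messages and the identity are invented for the example; -i (ignore case) and --format=%s (print only the subject) are standard git log options.

```shell
# Throwaway repository with three empty commits.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

git commit -q --allow-empty -m "Fixing bug 103"
git commit -q --allow-empty -m "Add search page"
git commit -q --allow-empty -m "fixing BUG 104 in parser"

# Matching is case-sensitive by default: only the first message matches.
exact=$(git log --oneline --grep='Fixing bug')

# -i ignores case, and the pattern is a regular expression.
subjects=$(git log -i --grep='fixing bug 10[0-9]' --format=%s)

echo "$exact"
echo "$subjects"
```

The second search should list both bug-fix commits, newest first, while the first one lists only the commit whose message matches the exact capitalization.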



Find Commits by Diff: Pickaxe

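To find commits by the content of the diff they introduced, use the -S option of the log command, which is known as the pickaxe. It lists the commits that change the number of occurrences of the given string, that is, commits that add or remove it. The string is conventionally written directly after -S with no space. Quote it if it contains characters that are special to the shell, such as parentheses.

git log -Skeyword
git log -S'method()'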
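The sketch below contrasts -S with the related -G option, which matches a regular expression against added and removed lines instead of counting occurrences. The file calc.txt, its contents, and the identity are hypothetical.

```shell
# Throwaway repository with three commits touching one line.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

printf 'total = compute()\n' > calc.txt
git add calc.txt
git commit -q -m "Add compute call"

# The line is edited, but "compute()" still occurs exactly once.
printf 'total = compute()  # cached\n' > calc.txt
git commit -q -am "Annotate compute call"

printf 'total = 0\n' > calc.txt
git commit -q -am "Remove compute call"

# -S: commits where the number of occurrences of the string changed.
# The quotes keep the shell from interpreting the parentheses.
pickaxe=$(git log --format=%s -S'compute()')

# -G: commits whose diff adds or removes a line matching the regex.
regex=$(git log --format=%s -G'compute')

echo "-S: $pickaxe"
echo "-G: $regex"
```

Here -S skips the middle commit because the occurrence count stays the same, while -G reports it because the matching line was rewritten.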



Find Commit by Line: Blame

To find the commit ID for each line in a file, use the git blame command. It shows the commit that last modified each line of the file. This is useful when trying to figure out which commit introduced a change.

git blame filename
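For larger files it often helps to restrict blame to a line range. The following sketch, with a made-up notes.txt and identity, uses -L to pick a range and --porcelain to extract the commit hash for a single line.

```shell
# Throwaway repository with two commits to the same file.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

printf 'alpha\nbeta\ngamma\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Change only the second line in a later commit.
printf 'alpha\nBETA\ngamma\n' > notes.txt
git commit -q -am "Uppercase beta"

# -L limits the output to a line range; -s hides author and date.
git blame -s -L 2,2 notes.txt

# --porcelain starts each entry with the full commit hash,
# which makes it easy to feed into other commands.
line1_commit=$(git blame --porcelain -L 1,1 notes.txt | head -n 1 | cut -d' ' -f1)
line2_commit=$(git blame --porcelain -L 2,2 notes.txt | head -n 1 | cut -d' ' -f1)

line1_subject=$(git log -1 --format=%s "$line1_commit")
line2_subject=$(git log -1 --format=%s "$line2_commit")
echo "line 1: $line1_subject"
echo "line 2: $line2_subject"
```

Line 2 is attributed to the later commit because blame reports the last commit to touch each line, not the one that first created it.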



Find Commit by Test: Bisect

Bisect is a binary search mode that Git offers for finding the commit that introduced a bug or problem. Git checks out a commit, you examine it and mark it as good or bad, and Git halves the remaining range each time until a single commit is left. You are free to use whatever test you need to decide whether a commit is good or bad. This is useful when we are searching by functionality or behavior rather than by a pattern.

Git needs a known good commit and a known bad commit to bound the search. If you run git bisect bad without an argument, HEAD is marked as bad.

git bisect start
git bisect bad
git bisect good HEAD~20

In the commands above, we tell Git to start a bisect search. Then we mark HEAD as bad and the commit 20 commits back as the last known good commit.

Git now checks out a commit roughly halfway between the two and waits for you to test it and mark it as good or bad. After each answer it checks out the next candidate.

git bisect bad
git bisect bad
...
git bisect good
git bisect good
git bisect bad



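The bisect search then ends, and Git reports the first bad commit, that is, the commit that introduced the problem. Run git bisect reset afterwards to return to the commit you were on before the search started.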
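When the good-or-bad test can be scripted, the manual answers can be replaced by git bisect run, which marks each commit according to the exit status of a command. The sketch below is one possible setup in a scratch repository; status.txt, the grep-based test, and the identity are assumptions made for the example.

```shell
# Throwaway repository; every name below is made up for illustration.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

# Ten commits. From commit 7 onward, status.txt contains "broken".
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -ge 7 ]; then
        echo "broken $i" > status.txt
    else
        echo "ok $i" > status.txt
    fi
    git add status.txt
    git commit -q -m "commit $i"
    i=$((i + 1))
done

# Start with HEAD as bad and the commit 9 steps back as good.
git bisect start HEAD HEAD~9 > /dev/null

# The test command: exit status 0 means good, 1 means bad.
git bisect run sh -c '! grep -q broken status.txt' > /dev/null

# refs/bisect/bad now points at the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad: $first_bad"

# Leave bisect mode and return to the original commit.
git bisect reset > /dev/null 2>&1
```

With this history the search should settle on commit 7 after a handful of automatic checkouts. In a real project the test command would typically be a build step or a test script.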


Dangling Commits: Git Fsck

Finally, the commit we need might be dangling and thus not findable by the methods above. A commit is dangling if no branch, tag, or other reference leads to it and no other commit has it as a parent, so there is no path through history that a user can follow to see the commit or check out its changes.

A commit that has been cut off from every branch, for example by a reset or a deleted branch, usually stays referenced by the reflog for a configurable period (90 days by default, or 30 days for entries that are no longer reachable from the branch tip). While a reflog entry points at it, Git fsck does not report it as dangling. Once the entry expires, the commit becomes dangling and is eventually removed by garbage collection.

To find dangling commits, use the Git fsck command. If there are any dangling commits, it lists them.

git fsck

If you find a dangling commit, you can see its content using the Git show command.

git show commit-hash-id

To show the diff explicitly (for a regular commit, Git show already includes it by default):

git show -p commit-hash-id
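To tie the steps together, here is a sketch that deliberately creates a dangling commit in a scratch repository, finds it with fsck, and makes it reachable again by pointing a branch at it. The branch names topic and recovered and the identity are hypothetical. Expiring the reflog as done here discards recovery information, so it only belongs in a throwaway repository.

```shell
# Throwaway repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

git commit -q --allow-empty -m "base"

# Make a commit on a side branch, then delete that branch.
git checkout -q -b topic
git commit -q --allow-empty -m "lost work"
lost=$(git rev-parse HEAD)
git checkout -q -
git branch -q -D topic

# The HEAD reflog still references the commit, so expire it immediately.
# Only do this in a scratch repository.
git reflog expire --expire=now --all

# fsck now reports the commit as dangling.
dangling=$(git fsck 2> /dev/null | awk '/^dangling commit/ {print $3}')
echo "dangling: $dangling"

# Pointing a branch at the hash makes the commit reachable again.
git branch recovered "$dangling"
recovered_subject=$(git log -1 --format=%s recovered)
echo "recovered: $recovered_subject"
```

Once a branch points at the commit, it shows up in git log again and is safe from garbage collection.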


References

  • http://amazon.com/Version-Control-Git-collaborative-development/dp/1449316387
  • https://git-scm.com/book/en/v2/Git-Tools-Searching