Thursday, May 31, 2018

Vim Window Sizing


There are few Window commands in Vim I really find useful but sometimes hard to remember regarding changing the size:
  • Maximize current horizontal window: Ctrl-w _
  • Maximize current vertical window: Ctrl-w |
  • Change the height of the current window:
    • Ctrl-w +
    • Ctrl-w - 
    • :resize 5
      • Replace 5 with the desired positive or negative number.
  • Change the width of the current window:
    • Ctrl-w >
    • Ctrl-w <
    • :vertical resize -5 
      • Replace -5 with the desired positive or negative number.
  • Close other windows: :only
    • The buffers remain open.
  • Convert the current window to a tab: Ctrl-w T
    • With this command you lose the original window.
  • Open current buffer in a tab: :tabe %
    • This is a good option to temporarily maximize (or zoom) the current window and then quit the tab to go back to all windows in the state you left them. It does not close the original window like Ctrl-w T.

Monday, January 2, 2017

Learn Vim with vimtutor


There's a builtin tutorial that comes with Vim which walks the user through seven lessons covering the basics of using Vim. This is a really nice way to get started with Vim.

You can start it with the following command:
vimtutor

It's a good idea to save the file right after opening it so that you can make whatever changes and come back to it again.
Saving the file is useful if you want to stop in the middle and come back to the tutorial later.

You can save using:
:w tutorial

This will save text into a file called tutorial in your current directory.

To see your current directory, use the pwd command inside Vim.
:pwd

Other Resources for Learning Vim


Finally, if you have Vim version 8 or later try the :smile command.



                            oooo$$$$$$$$$$$$oooo
                        oo$$$$$$$$$$$$$$$$$$$$$$$$o
                     oo$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$o         o$   $$ o$
     o $ oo        o$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$o       $$ $$ $$o$
  oo $ $ "$      o$$$$$$$$$    $$$$$$$$$$$$$    $$$$$$$$$o       $$$o$$o$
  "$$$$$$o$     o$$$$$$$$$      $$$$$$$$$$$      $$$$$$$$$$o    $$$$$$$$
    $$$$$$$    $$$$$$$$$$$      $$$$$$$$$$$      $$$$$$$$$$$$$$$$$$$$$$$
    $$$$$$$$$$$$$$$$$$$$$$$    $$$$$$$$$$$$$    $$$$$$$$$$$$$$  """$$$
     "$$$""""$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$     "$$$
      $$$   o$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$     "$$$o
     o$$"   $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$       $$$o
     $$$    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$" "$$$$$$ooooo$$$$o
    o$$$oooo$$$$$  $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$   o$$$$$$$$$$$$$$$$$
    $$$$$$$$"$$$$   $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$     $$$$""""""""
   """"       $$$$    "$$$$$$$$$$$$$$$$$$$$$$$$$$$$"      o$$$
              "$$$o     """$$$$$$$$$$$$$$$$$$"$$"         $$$
                $$$o          "$$""$$$$$$""""           o$$$
                 $$$$o                                o$$$"
                  "$$$$o      o$$$$$$o"$$$$o        o$$$$
                    "$$$$$oo     ""$$$$o$$$$$o   o$$$$""
                       ""$$$$$oooo  "$$$o$$$$$$$$$"""
                          ""$$$$$$$oo $$$$$$$$$$
                                  """"$$$$$$$$$$$
                                      $$$$$$$$$$$$
                                       $$$$$$$$$$"
                                        "$$$""""
 

Saturday, December 31, 2016

How to Ace the Software Engineer Interview


I recently had to go through the whole interview process which gave me an opportunity to look at interviewing from the interviewee perspective after being on the other side of the table for a long time.
I wanted to put down my learnings as a reference for myself and friends who occasionally ask for interview advice.
When it comes to the details of some technical topics, I will reference good resources instead of duplicate them.

The software engineering interview will vary depending on the company but will generally be composed of 4-5 whiteboarding sessions about an hour each. The bar for the technical whiteboarding interviews will depend on how competitive/popular the company is.

This post will focus on the general software development interview which may differ a bit from specialized interviews for specific roles like machine learning, computer vision, or other specializations. Note that more junior interviews will be lighter on design.

There are two parts related to the technical interview that I want to initially get out of the way: the behavioral interview and technical screen.



Behavioral / Leadership
"The best predictor of future behavior is past behavior."

The behavioral part of the software engineering interview varies a lot more than the technical part depending on the company. Some companies won't even bother with it.

The idea behind a behavioral interview is to probe into the candidate's past experience to find examples that exhibit desirable qualities. Most companies will use the STAR methodology to find examples. Some companies have a known list of qualities they look for. For example, Amazon is famous for its Leadership Principles. It's also possible to get direct questions around team dynamics and handling difficult situations.

Examples of difficult situations you maybe asked about include: tight timelines, ambiguous requirements, customers, and feedback.

A few things to keep in mind:
  • Have a story or something to say ready for everything you list on your resume.
  • Have examples ready of challenges you have handled in the past. To make it worthwhile, it should show some pressure or constraints.



Tech Screen
"Seek to first to understand, then to be understood." - Covey

Most big companies will do a phone screen before the onsite interview unless there's an internal referral or some other justification to skip the technical screen. The technical screen is a company's first filter to validate the candidate can write more than a resume or an online profile. The tech screen can be done in person or more commonly over the phone. Recently, companies are employing online exams for this purpose like the following which can also be used for practice:

The tech screen will generally focus on one or two coding problems that require problem solving. For all practical purposes the phone screen is just another technical whiteboarding session and the best way to prepare for it is the same as the onsite interview.



So how do you ace the coding interview?
I think the coding interview skills can be broken into a few categories: problem solving, CS theory, programming language, familiarity with interview questions, and design.

Problem Solving
"The formulation of the problem is often more essential than its solution, which may be merely a matter of mathematical or experimental skill." - Einstein

This might be the hardest area to prepare for. Your best friend here is practice. Aside from practice there are some general patterns to notice and strategies to use. The most common element to solving coding problems and to problem solving in general is to approach the problem from a counter-intuitive angle or using a method not immediately apparent. The following patterns and strategies should help trigger such thoughts.

  • Go through a couple simple instances of the problem by hand. This helps verify your understanding of the problem as well as allow you to notice any patterns that emerge from solutions.

  • Iteration. When the problem involves iteration, consider different ways to iterate. One of the most common variations is to iterate using two indexes instead of one usually using different paces. As you can see this is somewhat unexpected. When generally working with collections, you use a single index or iterator so this forces you think outside the box a bit. Other variations involve iterating in different directions.
  • Determine if a linked list has a loop.
  • Determine if a linked list has a loop of a specific size.
  • Array pairs with a given sum.

  • Go through different data structures and algorithms you know and see if any or a combination fits the requirement.
    • Min-Stack problem.

  • Combine multiple operations. We usually get fixated on thinking in terms of a single operation at a time. In some problems it helps to consider various operations even the reverse of the operation you eventually need. For example using subtraction when the problem is asking about summing. Another version of this is to think through other problems that have some resemblance and see if you can combine solutions or ideas from other problems.
    • Rolling sum.
    • Variations of popular problems like variations on the Edit Distance problem.

  • Preprocessing. In many problems the input data is not immediately useful for the output. In these instances, think about how would you want the input data to be to make it easier to get the results.
    • Find all anagrams in a list of words.

The key is to entertain thoughts for different possibilities and see how far you can go with them. Solving puzzles and practicing interview questions helps sharpen your problem solving skills. More on interview questions is below. Here are some other ways to practice:



Programming Language
"A most important, but also most elusive, aspect of any tool is its influence on the habits of those who train themselves in its use. If the tool is a programming language this influence is, whether we like it or not, an influence on our thinking habits." - Dijkstra

You need to have mastery of at least one of the common programming languages: Java, Python, C/C++, C#, and JavaScript. Most companies will be open to hiring engineers of any programming language background as long you show strength in the language you use. This is because most software engineering concepts can be carried from one language to another and a strong candidate will not have a problem in picking up a new language.

In terms of interviewing though, I have found that it is much easier to use the most high level language possible during the interview. A language like Python makes the interview experience much more pleasant because the expressive power of the language allows you to focus on solving the problem instead of worrying about low level primitives that are easy to mess up especially when writing on a whiteboard or without an IDE.

The details for a lot of the information mentioned here varies from language to language. Whatever language you choose, make sure you understand its unique features when it comes to OOP, concurrency, design patterns, and conventions.


A few things to keep in mind for writing code in any language:

"Programs must be written for people to read, and only incidentally for machines to execute." - Abelson

  • The code should be as maintainable and readable as possible. Avoid hacks.
  • Use self-explanatory names.
  • Think about how you can test the code.
  • Break functionality into multiple small functions if needed, each with a single responsibility.
  • Use the best practices of the language.
  • Aim for self-documenting code but add comments when needed.

CodeCademy is an excellent resource for learning most of the common languages.


Computer Science Theory
"If you find that you're spending almost all your time on theory, start turning some attention to practical things; it will improve your theories. If you find that you're spending almost all your time on practice, start turning some attention to theoretical things; it will improve your practice." - Knuth

Computer Science theory or fundamentals has many components. I will highlight here the major categories and their sections relevant to interview questions.


Time Complexity
"Time is what keeps everything from happening at once." - Cummings


Online courses:


Data Structures
"Bad programmers worry about the code. Good programmers worry about data structures and their relationships."  - Torvalds


Other resources:

Online courses focused on data structures:


Algorithms
"When I consider what people generally want in calculating, I found that it always is a number." - al-Khwarizmi



Online courses focused on algorithms:


Courses for data structures and algorithms:


Other DS and Algorithms resources:


Other Topics
"People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird." - Knuth




Interview Questions
"The key to wisdom is knowing all the right questions." - Simone Sr.

After seeing many interview questions, you will realize that most questions are variations of other questions.
Solving as many interview questions as possible builds your familiarity with topics and various solution approaches which become part of your problem solving toolkit.
You can build this familiarity by using online resources or books on this exact topic.

Of the online resources I found the following to be the most useful:

There are also books written specifically on this:

Most companies will automatically reject a candidate that pretends to not have seen a question he knows. The point is not about memorizing answers. It's about being familiar with many different problems and solutions. Not only is the familiarity useful for coming up with a solution to a new problem, it also builds up your confidence so you are not intimidated when seeing a new question because chances are you already have seen something like it before.



Design
"Form follows function." - Sullivan

Most entry level software development interviews will not include design questions. The expectation is typically that if the position requires some years of experience that the candidate should have design competencies.
Unlike other topics covered here what is meant by asking a design question is not commonly agreed upon. I've seen different engineers ask about totally different competencies when coming up with design questions. Nevertheless, I think most design questions fall into one of the following groups: Object Oriented Design, Design Patterns, and System Design.


Object Oriented Programming
"Implementation inheritance causes the same intertwining and brittleness that have been observed when goto statements are overused. As a result, OO systems often suffer from complexity and lack of reuse." - Ousterhout

In OOP, there are some basic definitions that everyone should know like Encapsulation, Inheritance, and Polymorphism.
There is also a design aspect to OOP commonly called Object Oriented Design. In some interviews you may get questions specifically on OO design which is different from system design which will be discussed below.

In OO design, classes represent entities/objects, member variables represent attributes/data, and methods represent behaviors/functions. The most important aspect of OO design is being able to factor our common functionality in the right super class and knowing when to extend/inherit or contain another class.

Some things to understand:
  • Encapsulation/data hiding, inheritance, and polymorphism. You should be able to provide examples of each.
  • Super/parent class vs. derived/child/sub-class.
  • Difference between a function and a method.
  • Inheritance vs. composition. What are the pros and cons?
  • Class vs. abstract class vs. interface.
  • Pros and cons of multiple class inheritance. Why are the implications of a language supporting it?
  • Overriding vs. overloading.
  • C++: virtual functions, pure virtual functions, and friend functions.

OOP becomes very powerful when abstracting responsibilities into different layers so that at each layer you are only concerned with a small number of easy problems. Then multiple layers coalesce to provide a complex functionality. OOP also makes it easier to connect various objects via composition. This is ideal, for example, in trees for referencing children.

Sample questions:
  • Design a web browser.
  • Design a library management system.
  • Write an interpreter that evaluates a statement according to a certain grammar. This could be a mathematical formula or any other type of formal language.

Resources:



Design Patterns
"Each pattern describes a problem that occurs over and over again in our environment, and then describes the core of the solution to that problem in such a way that you can use this solution a million times over, without ever doing it the same way twice." - Alexander

Design patterns are very useful to have in your toolkit in general and they come up sometimes in interviews in the context of a design question.
There is some overlap in naming and responsibilities of design patterns which sometimes complicates things. You should be able to at least know when a problem makes a perfect usecase for a certain pattern.

The following is a list of common patterns.
  • Abstract factory: for creating families of related objects.
  • Builder: for creating complex objects or when building an object needs to be done in steps.
  • Factory method: using a (static) method for object creation instead of the constructor. Effective Java makes recommendation for this pattern.
  • Lazy initialization: delay creation until first use.
  • Prototype: create a new object by cloning a prototype object.
  • Singleton: only allow a single object of the class to be created.

  • Adapter/Wrapper: adapt the interface of class to a different interface.
  • Decorator/Wrapper: add functionality while keeping the same interface allowing layering of multiple decorators on top of each other.
  • Proxy: also implemented as a wrapper but intent is control access to the target object while maintaining the same interface.
  • Bridge: in the bridge pattern you have two hierarchies of classes: abstraction and implementation. The implementation classes provide primitive functionalities while the abstraction classes provide high-level functionalities that build upon the methods of the implementation classes. The bridge class then behaves as the link between the two hierarchies.
  • Composite: treat a group of related objects as a single object by using recursive composition.
  • Façade: provide a common easier interface to a group of low-level interfaces.
  • Flyweight: share stateless objects to avoid object instantiation overhead.

  • Chain of responsibility: have a chain of objects eligible that either punt to the next object in the chain or handle the request halting the traversal.
  • Command: encapsulate a request as an object which allows queueing, logging, and undoing operations.
  • Interpreter: evaluate a language by encapsulating its grammar.
  • Iterator: allow traversal while hiding the internal representation of the collection.
  • Mediator: encapsulate the interaction of multiple classes in a mediator thus allowing loose coupling among the classes involved.
  • Memento: encapsulate snapshotting of the state of an object.
  • Observer/PubSub: implement one-to-many relationships as publisher and a group of subscribers.
  • State: alternate the behavior of an object by encapsulating state in an object and delegating requests to it. This helps avoid long conditionals or switch statements.
  • Strategy: encapsulate a group of algorithms to make it easier to alternate the algorithm in the client.
  • Template method: define a skeleton of an algorithm and let subclasses fill in the details.
  • Visitor: group similar functionality on different objects in one class while allowing double-dispatch.


For more see:


System Design
"There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies." - Hoare

In more senior interviews, there is an element of system design. Below are some topics that are important for system design.

System design questions usually revolve around scaling or distributed systems. The tricky part in massively scaled systems is partitioning load and data. In distributed systems, the tricky part is about detecting and handling partial failures and balancing consistency, availability, and network failures.

  • Data partitioning. A scalable system is one that avoids hot spots. The core idea in eliminating hotspots is to shard the data. There are a few techniques to do this. The challenge in data sharding is how to dynamically re-shard to reduce hot spots.
    • Range partitioning: Divide the data based on ranges of an identifier and assign each range to a server. The main issue with this method is that it requires a numeric id that is evenly distributed over known ranges.
    • Hash partitioning: Apply a hash function on the data to produce a number then assign data to a group of servers using the modulo operator. The key disadvantage with this method is that changing the number of nodes requires remapping the entire data.
    • Distributed hash tables: A two-step hash table where the first step maps data to its assigned node then the node stores the data according to the data element's key.
  • CAP theorem: Consistency, availability, and network partitions. Choose two.
  • Recovery Oriented Computing (ROC).
  • Redundancy. The key is to create "backups" of the data to provide fault tolerance but we need to do this without down time and we also need to be able to fail over from faulty nodes without down time so data replication is used. One of the core issues to solve in data replication is the management of node membership.



Books
"Always read something that will make you look good if you die in the middle of it." - O'Rourke

If you're like me and like information in book form, you can find most of the topics outlined here in the following books:

Computer Science Fundamentals:

Design:



Competitive Programming
"With competition everyone has to try harder." - Greene

I felt that this post wouldn't be complete without mentioning competitive programming since there are so many options now. The best thing about these websites is that they are computer graded or in other words have an online judge.

Solving these challenges or competitions is great for programming practice and sometimes even wining prizes.




Final Thoughts
"Festina lente."

The art of programming is hard to master so like anything hard don't expect it come quick. It's achievable but it takes deliberate effort and practice. Simply put, you can't be a great software engineer in 21 days.

A final tip is to approach openings through a recruiter or even better an internal referral instead of through the regular jobs/careers website for the company you are interested in. Even when looking at a job posting online, you can usually find out who posted it and try to find his/her contact info.

Saturday, February 20, 2016

How to Search with Git



Git offers several powerful options to search for commits or changes.

Find Files by Content Pattern: Git Grep

To search for a specific pattern in repository files, you can use the Git grep command.

git grep pattern


Pattern can be a regular expression.

By default Git grep searches in the current directory but you can pass a branch name or directory to search instead. Git grep has two advantages over the shell's grep command. First, it only searches tracked files so you get more relevant results. Secondly, the shell's grep command requires that you have all the source files you need to search locally. On the other had, Git grep can search any known branch whether it's checked out or not and whether it's local or remote.

git grep pattern remote/branch

Here we search a remote branch that doesn't exist in the local repository for a pattern.


Find Commits by Message Pattern: Git Log Grep

To find commits by a pattern in the message added with the commit, not the diff, use the --grep option for the git log command.

git log --grep='Fixing bug 103'



Find Commits by Diff: Pickaxe

To find commits using the actual diff the commit added you can use the -S option for the log command, which is known as pickaxe. Note there's no space after the -S.

git log -Skeyword
git log -Smethod()



Find Commit by Line: Blame

To find the commit id for each line in a file use the git blame command. This command will show the commit that added each line in the file. This is useful when trying to figure out which commit introduced a change.

git blame filename



Find Commit by Test: Bisect

Bisect is a binary search mode Git offers to find a commit that introduced a bug or problem by allowing the user to examine commits and specify whether the commit is good or bad until, using binary search, Git narrows down the search and finds the single commit needed. You are free to use whatever test needed to determine whether the commit you have is good or bad. This is useful when we are searching by functionality or behavior not a pattern.

Git needs to know the last known good commit to have a starting point for searching. Git also needs a bad commit which is by default HEAD.

git bisect start
git bisect bad
git bisect good HEAD~20

In the commands above, we tell Git to start bisect search. Then we specify HEAD as a bad commit and the commit 20 changes back as the last known good commit.

Now Git will start checking out commits and asking whether the commit is good or bad.

git bisect bad
git bisect bad
...
git bisect good
git bisect good
git bisect bad



Now bisect search ends and Git tells us exactly which commit introduced the first bad change.


Dangling Commits: Git Fsck

Finally, the commit we need might be dangling and thus not searchable by the above methods. A commit is considered dangling if it is not connected to the repo tree, meaning there is no path of changes that a user can use to see the commit or checkout its changes.

A commit that is disconnected from other objects is referenced by the reflog for a configurable period with a default of 90 days before it becomes dangling. After the reflog expiration for disconnected objects, the commit is garbage collected and becomes dangling.

To find dangling commits use the Git fsck command. If there are any dangling commits, they will be shown.

git fsck

If you find a dangling commit, you can see its content using the Git show command.

git show commit-hash-id

To see the diff:

git show -p commit-hash-id


References

  • http://amazon.com/Version-Control-Git-collaborative-development/dp/1449316387
  • https://git-scm.com/book/en/v2/Git-Tools-Searching