Edit large files on Linux

One of our readers requested a list of editors on Linux capable of editing huge files, on the order of gigabytes. In one of our earlier articles we explored some commands to create huge files on Linux. We also looked at glogg, a log viewer that handles files of that size but cannot edit them. Here’s a list of some robust editors.
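
To try out the editors below, a large test file is handy. Here is a minimal sketch, assuming truncate from GNU coreutils is installed; the helper names make_sparse_file and file_size_bytes are made up for this illustration. The file is sparse, so it takes up almost no disk space until something is written to it.

```shell
# Create (or resize) a sparse file of the given size.
# Usage: make_sparse_file PATH SIZE   (SIZE as understood by truncate, e.g. 2G)
make_sparse_file() {
    truncate -s "$2" "$1"
}

# Print the apparent size of a file in bytes.
file_size_bytes() {
    wc -c < "$1" | tr -d ' '
}
```

For example, make_sparse_file /tmp/huge.bin 2G creates a 2 GB file to open in any of the editors (the path is only a placeholder).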

1. lfhex

A Qt based GUI editor. Can view and edit files in hex, octal, binary, or ascii text mode. Can work with files much larger than system RAM or even address space.

Features

  • Low memory usage
  • Instant load times
  • Instant save times
  • Infinite undo/redo
  • Dynamic hex/octal/binary/ascii editing mode
  • Search
  • “Goto” field for jumping to a specified offset (the offset can be given as a mathematical expression, e.g. 0xff*3)
  • 64 bit offset support
  • Dynamic resize support
  • Conversion dialog
    > Linked to selection
    > Shows conversion to int, float, double, ascii, hex
    > Modifying int/float/double/ascii/hex updates all the other fields
    > Option to show/edit byteswapped values
  • Binary comparison dialog
    > Differences can be walked by “block”
    > A block can be from 1-16 bytes long
    > Starting offset can be different in each file
  • Minimal dependencies (just Qt)
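
The “Goto” field accepts arithmetic such as 0xff*3. As a rough illustration only (lfhex has its own expression parser; shell arithmetic is merely standing in for it here, and offset_of is a hypothetical helper name), the same offset can be worked out in the shell:

```shell
# Evaluate an offset expression with shell arithmetic and print it
# in decimal and in hex.
offset_of() {
    printf '%d 0x%x\n' "$(($1))" "$(($1))"
}
```

Running offset_of '0xff*3' prints 765 0x2fd, i.e. 255 times 3.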

Limitations

  • Does not support insertion/deletion (cannot change file size)
  • Search/compare can be slow (compared to cmp or any other non-paged IO app)
  • Cannot search files with unsaved modifications

To install on Ubuntu:

$ sudo apt-get install lfhex

2. Joe

Joe is a very powerful, full-featured terminal editor. It is written in C and its only dependency is libc.

Features

  • Can view and edit files in text or hex mode
  • Supports UTF8 characters
  • Multi-file search and replace: the file list is given either on the command line or by a UNIX command (grep/find) run from within JOE
  • Mouse support, including wheel (works best when using xterm). The mouse can resize windows, scroll windows, select and paste text, and select menu entries.
  • Context display on status line: allows you to see name of function cursor is in
  • Syntax highlighting
  • Swap file allows editing files larger than memory
  • Bash-like TAB completion and history for all prompts
  • Jump to matching delimiter
  • and many more…

Limitations

  • No vertical windows
  • No folding
  • No background spell checking like Microsoft Word
  • Cannot highlight all matching words

To install on Ubuntu:

$ sudo apt-get install joe

3. HEd

HEd is a powerful hex editor with a hexdump -C like interface. It can load and edit files far larger than available memory.

Features

  • Very fast on very large files (keeps only necessary portion of the file in memory)
  • Fast inserting anywhere in the file
  • Fast saving of intermediate changes
  • vim-like controls (and exmode)
  • Powerful expressions concept for flexible searching and transformation operations on the file or a selected region

HEd is not available by default on Ubuntu. Download HEd v0.5 compiled on Ubuntu 14.04 amd64 here.
md5sum: 5eb449e5d613d5925c6ee50ea11ab317

4. LargeFile

This is a plugin available for vim that turns off certain vim features to handle large files. The g:LargeFile (by default 100) option describes the minimum size of a file to be considered as a LargeFile, in megabytes. This option can be set in ~/.vimrc as:

let g:LargeFile=50
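
The threshold is a plain size comparison. As an illustration from the shell side (this is not part of the plugin; exceeds_mb is a hypothetical helper, and 1 MB is assumed to mean 1048576 bytes), a script could apply the same kind of megabyte threshold before deciding how to open a file:

```shell
# Succeed if FILE is at least LIMIT_MB megabytes in size.
# Usage: exceeds_mb FILE LIMIT_MB
exceeds_mb() {
    size=$(( $(wc -c < "$1") ))
    [ "$size" -ge $(( $2 * 1048576 )) ]
}
```

For instance, exceeds_mb big.log 50 && echo "large file" reports files of 50 MB and above (big.log is a placeholder name).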

Limitation

Note that LargeFile may not be able to handle a 1GB file as it doesn’t change the way vim opens a file.

Installation

Download the latest version from the homepage. Then:

$ vi LargeFile.vba.gz
:so %

Git commands for everyday use

Git is a very advanced and powerful version control system. It supports so many features that it’s difficult to remember each and every command unless you are a heavy Git user or a Git dev yourself. I try not to remember too many of them. However, Git is at its best when used from the command line, and in my experience a few commands are used much more frequently than others. I will list those in this article as a reference for myself as well as for others. The examples use the repository link of GitHub’s own Atom project.

Before I begin, here’s a good practice for a Git dev – always have a clean master branch checked-out from the original project’s master branch. Make your changes to a separate development branch and rebase your master and dev branch regularly with the main project branch. In other words, your changes should go into the project’s master branch from your dev branch. Those changes are then pulled into your master branch.

Git installation and initial setup
  • Install Git on Ubuntu
    $ sudo apt-get install git
  • Set your name, email address and default editor
    $ git config --global user.name "My Name"
    $ git config --global user.email "your-email@provider.com"
    $ git config --global core.editor vim
Add a project
  • Use GitHub’s web interface to fork the project, which gives you your own master branch. Clone your master branch locally.
    $ mkdir atom
    $ cd atom
    $ git clone https://github.com/username/atom.git
  • In case of very old or large repositories, you can do a shallow clone by fetching the history of only the latest n commits.
    $ git clone --depth 2 https://github.com/username/atom.git
  • Add the original Atom project as upstream.
    $ git remote add upstream https://github.com/atom/atom.git
  • Use the GitHub web interface to create a new branch named dev. This is where your own changes should go first.
  • Check-out (or switch to) the dev branch
    $ git checkout -b dev
The powerful checkout command
  • Check-out (switch to) a branch, tag, revision or revert changes to a file
    $ git checkout branch/tag/revision/filename
  • Revert all local changes
    $ git checkout -- .
Get upstream changes to your master branch
  • Switch to the master branch
    $ git checkout master
  • Check details of origin/upstream
    $ git remote show origin
    $ git remote show upstream
  • Retrieve the latest changes from upstream
    $ git fetch upstream
  • Merge upstream into your master branch
    $ git merge upstream/master
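
The steps above can be wrapped into one shell function. This is a minimal sketch, assuming the remote named upstream has already been added as described earlier; sync_master is a made-up name, and --ff-only is my own addition, chosen so that the merge refuses to create a merge commit if your master is not clean.

```shell
# Bring the local master branch up to date with the original project.
# Assumes a remote called "upstream" exists in the current repository.
sync_master() {
    git checkout master &&
    git fetch upstream &&
    git merge --ff-only upstream/master
}
```

Run sync_master from anywhere inside the working copy. If the fast-forward is refused, master has local commits that should have gone to dev instead.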
Get changes from your master to dev branch

A few words on merge and rebase first. The rule of thumb: when you have no changes in your local dev branch, either merge or rebase will do, as both result in the same commit history. However, if you have changes in your local dev branch and would also like to get the changes from the remote, use rebase for a clean commit history.

  • Checkout dev branch
    $ git checkout dev
  • Merge or rebase depending on the situation
    $ git merge master
    $ git rebase master
  • Push the changes to your dev branch origin
    $ git push origin dev
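
To see why rebase keeps the history clean when dev has its own commits, here is a small sketch. The function names are hypothetical; the git commands themselves are the ones used above plus git rev-list for counting.

```shell
# Replay the local dev commits on top of master, keeping history linear.
update_dev() {
    git checkout dev &&
    git rebase master
}

# Count the merge commits that dev has on top of master.
# After a clean rebase this is 0; after "git merge master" with local
# commits on dev there would be a merge commit in this range.
merge_commits_on_dev() {
    git rev-list --merges --count master..dev
}
```

After update_dev, git log --graph --oneline shows the dev commits sitting in a straight line on top of master.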
Branch deletion
  • Delete a local branch
    $ git branch -D branch_name
  • Push the deletion to remote
    $ git push origin --delete branch_name
Other common scenarios
  • Check the branch and local status (maybe after a merge/rebase conflict)
    $ git status
  • Check local modifications to tracked files
    $ git diff
  • Set vimdiff as the diff tool
    $ git config --global diff.tool vimdiff
    $ git config --global difftool.prompt false
    $ git config --global alias.d difftool

    To view the diff in vimdiff

    $ git d
    $ git d commit#1 commit#2
    $ git diff HEAD^ HEAD
  • Check commit history
    $ git log --stat
  • Stash local changes, pull, then restore the stashed changes
    $ git stash
    $ git pull
    $ git stash pop
  • Commit changes locally
    $ git commit -a -s
  • Push the changes to origin
    $ git push origin dev
  • Undo a commit, make changes, redo it
    $ git commit ...
    $ git reset --soft HEAD^
    //edit files
    $ git add ...
    $ git commit -c ORIG_HEAD
  • Edit/amend the most recent commit message
    $ git commit --amend -m "New commit message"

    If the commit is already pushed to remote

    $ git push origin dev --force
  • Clean local changes
    $ git reset --hard HEAD
    $ git clean -f -d   # also removes all untracked files and directories
  • Reset unmerged files
    $ git reset filename
    $ git checkout filename
  • Accept all incoming changes after a merge conflict occurs
    $ git reset --hard origin/dev
    $ git merge -X theirs master
  • If you have lost the URL of the remote repo you forked
    // if referential integrity has been broken:
    $ git config --get remote.origin.url
    // if referential integrity is intact:
    $ git remote show origin
Tagging

Tags are baselines you can revert to in the future.

  • Get and list tags
    $ git pull --tags
    $ git tag
  • Create a tag
    $ git tag -a tag_name -m "your message here"
    $ git push origin --tags
  • Delete a tag
    $ git tag -d tag_name
    $ git push origin :refs/tags/tag_name

Another useful list is the giteveryday man page.

Feel free to suggest if you think I should add any command to the article.

patool: extract archives with one tool

I remember a former colleague who would always confuse the tar switches for gzip and bzip2. That led to a lot of confusion among people when he sent packages to them. However, I do not blame him. Using compressed archive files on Linux requires you to remember a lot of switches. Though tar can nowadays detect the file format (gzip/bzip2) while extracting, how about a tool that could handle many more compression types?

Manipulate compressed files without extracting

gz and bz2 are very common compression formats on Linux. How do you read a gzip or bzip2 compressed text file? Or how can you list the files in an archive without extracting it? Fortunately the packages tar, gzip and bzip2 come with a set of handy utility programs which can do these for you. In this article we will take a peek into how those utilities work.

vim column selection

There are many editors which support column mode selection. It comes in handy in many situations, e.g. when you want to remove the line numbers preceding each line in a code snippet. To do this in vim:

  1. Open the file
  2. Press <v> to enter visual mode
  3. Press <Ctrl-v> to enter the block selection mode
  4. Use the arrow keys to select all the text you want
  5. Press <Del> or <d> to remove (cut) the selection, <y> to yank (copy), <c> to replace it with new text

To insert spaces, i.e. shift a block to the right:

  1. Press <Ctrl-v> to enter the block selection mode
  2. Select the first column of each line
  3. Press <Shift-i> for block insertion
  4. Insert space(s)
  5. Press <Esc> to return to normal mode; the inserted text then appears on every selected line
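
Outside vim, the line-number removal mentioned at the top can be approximated from the shell. This is a minimal sketch using sed; strip_line_numbers is a hypothetical name, and the pattern assumes every line starts with optional spaces, a number and at most one separating space.

```shell
# Read text on stdin and drop a leading line number from each line:
# optional whitespace, one or more digits, then one optional space.
strip_line_numbers() {
    sed -E 's/^[[:space:]]*[0-9]+[[:space:]]?//'
}
```

Usage: strip_line_numbers < snippet.txt > clean.txt (both file names are placeholders). Unlike a block selection in vim, this also strips digits from lines of code that happen to begin with a number, so check the result.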

Enable extra compression formats on Ubuntu

While any Linux distro today supports the standard Linux compression formats like gzip, bzip2 etc. out of the box, most newbies wonder how to use formats like rar, 7z etc. on Linux. The good news is that, while you need to install a third-party app like 7zip or WinRAR on Windows, on Ubuntu everything comes down to integration with the standard archive management tool, File Roller. It comes with the same flexibility and functionality that you get on Windows, like compression levels, spanned archives etc.

Write access for all on new ext4 partition

If you create a new ext4 volume (partition) on a hard disk and try to write to it as any user other than root, you might not be able to do so due to permission issues. While there are many solutions floating around, here’s a simple way to do it from the GUI without touching /etc/fstab on Ubuntu:

  • Click on the disk icon in the launcher to open the new volume in Nautilus.
  • Check the mountpoint. Normally it will be under /media directory.
  • Make sure the volume is mounted.
  • Open a terminal, type sudo nautilus and provide your password to enter privileged mode.
  • Browse to the mountpoint and right click. Select Properties and go to Permissions tab.
  • Change the permissions for the volume as you wish and also for internal files.
  • Unmount the volume and mount again. You should be able to write to the volume as a regular user now.
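
To confirm the result from a terminal, a quick check run as the regular user can help. This is a minimal sketch; can_write_to is a hypothetical helper that simply tries to create and remove a temporary file in the given directory.

```shell
# Succeed if the current user can create a file in the given directory.
can_write_to() {
    probe=$(mktemp "$1/.write-test.XXXXXX" 2>/dev/null) || return 1
    rm -f "$probe"
}
```

For example, can_write_to /media/yourvolume && echo writable, where /media/yourvolume stands for the actual mountpoint of the new volume.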

Random bash & vi tips

Here are some random bash and vim tips and tricks which might come in handy every now and then:

  1. Instead of running multiple commands using sudo, issue any of the following once and then run all subsequent commands as root:
    $ su
    $ sudo bash
  2. To let cd also look for directories under another base directory (here /etc), add the following in ~/.bashrc:
    export CDPATH=/etc
  3. Run a shell from inside vi:
    :shell
  4. Run a command from within vi:
    :!ls
  5. Split vi vertically and open a new file:
    :vsplit newfile
  6. vsplit opens the new file on the left. To swap the panes use <Ctrl-w-r>.
  7. To highlight the current line in vim:
    :set cursorline
  8. To force vim to remember the last position in a file opened earlier, edit /etc/vim/vimrc and uncomment the 3 lines as shown in the snippet:
    " Uncomment the following to have Vim jump to the last position when
    " reopening a file
    if has("autocmd")
      au BufReadPost * if line("'\"") > 1 && line("'\"") <= line("$") | exe "normal! g'\"" | endif
    endif
    

    Users will need to log out and log back in for this to take effect.

  9. Quick spelling suggestions/completions (case-insensitive):
    $ look spellin
    spelling
    spelling's
    spellins
  10. A smart way to remove all blank lines in a file:
    $ awk NF filename
  11. If you have missed running a command with sudo and want to do so, run:
    $ sudo !!
  12. To add some colour to your bash prompt, uncomment the following in ~/.bashrc:
    #force_color_prompt=yes
  13. To paste in vi without auto-indentation:
    :set paste
    //to switch paste mode off again
    :set nopaste
  14. To count items in a directory:
    $ ls -1 | wc -l    # count visible items
    $ ls -1a | wc -l   # count hidden items as well (includes . & ..)

    fish shell has a builtin function to do this:

    $ count (ls -1)
  15. Press <Ctrl-x-e> in the terminal and your default editor will open up.
  16. The ss command is similar to netstat. It can show more information on TCP sockets and their state.
  17. The tree command shows the current directory structure in a tree format. pstree does the same for processes.
  18. If you are looking for a restricted environment for users of your server, check out rbash.
  19. Lookup IP address and geographic info in bash:
    $ curl ipinfo.io/10.10.10.10
    OR
    $ wget -qO- ipinfo.io/10.10.10.10 | cat
    
    // To check your own public IP address
    $ curl ifconfig.me/ip
    OR
    $ wget -qO- ifconfig.me/ip | cat
    
    // For whois information of your IP
    $ whois $(wget -qO- ifconfig.me/ip | cat)
  20. To repeat the last colon command executed in vim press <@:> in command mode.
  21. To copy (yank) lines 26 to 41:
    :26,41y
  22. To indent, use >. To indent 10 lines: 10>>. To visually mark a block of lines and indent it: vjj> (v for visual mode, each j extends the selection to the next line, > indents). To indent a block within curly braces, place the cursor on one of the braces and use >%. To auto-indent text while copying and pasting a block, use ]p in place of the usual p.
  23. Keyboard shortcuts for people who are too lazy to type a colon command when they want to exit vim. To save and exit from command mode: <Shift-z-z>. To discard changes and exit: <Shift-z-q>.
  24. To save, as root, a file that was opened as a regular user in vim: :w !sudo tee % and then reload the file when prompted.
  25. If backspace and delete keys don’t work in vim, add the following in ~/.vimrc:
    set backspace=indent,eol,start
  26. Quick stopwatch:
     $ time read

    Press <Ctrl-d> to stop.

  27. Clearer mount output:
    $ mount | column -t
    $ findmnt
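
Two of the tips above (removing blank lines and counting directory entries) can be kept as small functions in ~/.bashrc. This is a sketch with made-up function names; it relies only on awk, ls and wc.

```shell
# Print a file without its blank (or whitespace-only) lines.
# awk's NF is the number of fields on a line; lines with zero fields
# evaluate as false and are skipped.
remove_blank_lines() {
    awk NF "$1"
}

# Count the entries in a directory, hidden ones included,
# but without the . and .. entries.
count_entries() {
    ls -1A "$1" | wc -l | tr -d ' '
}
```

count_entries uses ls -A rather than ls -a, so the result does not need to be corrected for . and .. afterwards.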

Here are my ~/.vimrc contents, if interested:

set nu
set ai
set incsearch
set hlsearch
set ts=4
set shiftwidth=4
set ic
set cindent
set cursorline
set splitright

nmap  :TrinityToggleNERDTree
nmap  :TrinityToggleTagList

Run sudo without entering password

Ubuntu 12.04 (Precise) has taken every possible measure to disable root login from the GUI. Though there is a workaround, which I posted earlier, I was reluctant to log in as root after setting up my full environment as a different user. At the same time, as a power user I hate providing a password each time I run a command using sudo. I am using the ext4 filesystem and I have found a solution to this: add the option nouser_xattr in your fstab for the root partition. For example, my fstab entry is:

UUID=015054d2-1052-4635-aca9-4ccdd87af914 /               ext4    noatime,nodiratime,barrier=0,nobh,commit=20,nouser_xattr,errors=remount-ro 0       1

This will disable the prompt for password every time you run sudo. So you can write a script of commands which need root access (sudo), add the script in the Startup Applications and run it automatically every time after login.

Finally, another way to achieve the same result on any filesystem: add the following at the end of the /etc/sudoers file as root (preferably with visudo, which checks the syntax before saving):

username ALL=(ALL) NOPASSWD: ALL
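
A broken sudoers file can lock you out of sudo, so it may be safer to generate the rule and have it checked before relying on it. This is a minimal sketch; sudoers_rule is a hypothetical helper, and the drop-in file name in the comments is only a placeholder (it assumes /etc/sudoers includes the /etc/sudoers.d directory, as it does on Ubuntu).

```shell
# Print a passwordless-sudo rule for the given user name.
sudoers_rule() {
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$1"
}

# Example, to be run as root:
#   sudoers_rule username > /etc/sudoers.d/username-nopasswd
#   chmod 0440 /etc/sudoers.d/username-nopasswd
#   visudo -c -f /etc/sudoers.d/username-nopasswd   # syntax check only
```

visudo -c only checks the file and reports errors; it does not change anything.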

ctags & cscope: the fastest IDE

If you are a Linux developer, there is no alternative to the deadly combination of ctags and cscope when it comes to source code browsing and editing. I remember that years back, out of mere curiosity, I started learning them one evening and ended up practicing throughout the night to get accustomed to them. I have been a fan of ctags and cscope ever since. They are much lighter and faster than any IDE I have laid my hands on.

ctags and cscope are the solutions to all your code browsing needs

Installation

To install on Ubuntu:

$ sudo apt-get install exuberant-ctags cscope

Usage

I added the following in my .bashrc to generate the tags and cscope file data:

function ta ()
{
    # clean older info
    rm -f tags cscope.files cscope.out
    # list the C/C++ sources with absolute paths for cscope
    find "$PWD" -type f | grep -Ei '\.(c|h|cpp)$' > cscope.files
    # build the tags file recursively from the current directory
    ctags -R --tag-relative=yes .
}

To generate tags and cscope file information, navigate to the root directory of your project and run

$ ta
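
The file-listing step can also be done by find alone, without piping through egrep. This is a sketch of that variant, not a replacement for ta; list_sources is a made-up name.

```shell
# List C/C++ sources under a directory with absolute paths, one per line.
# -iname matches case-insensitively, like egrep -i in the pipeline version.
list_sources() {
    find "$(cd "$1" && pwd)" -type f \
        \( -iname '*.c' -o -iname '*.h' -o -iname '*.cpp' \) | sort
}
```

For example, list_sources . > cscope.files followed by cscope -b -i cscope.files builds the cscope database up front (-b builds the cross-reference only, -i names the file list).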

Troubleshooting on SuSE

I got the following error repeatedly on SuSE while trying to open tag search results (though it worked well on Ubuntu):

E429: File "/path/to/file.c" does not exist

Here are the steps I followed to fix it:

  1. Generate cscope.files with absolute path
    $ find /path/to/project/files | egrep -i "\.(c|h|cpp)$" > cscope.files
  2. Generate the tags file
    $ ctags -R . *.c *.h *.cpp --tag-relative=yes ./

A list of excellent tutorials, tips etc. to learn ctags & cscope:

The only plug-ins I use:

  • MiniBufExplorer
    NOTE: Add ‘set hidden’ to your ~/.vimrc so that you do not lose syntax highlighting when you close a buffer. This hides the buffer instead of closing it. vim has a bug which causes loss of syntax highlighting when a buffer is quit.
  • Taglist
  • CScope maps
  • a.vim

Some pointers:

  • search word under cursor: <Shift-#> (backwards) or <Shift-*> (forwards)
  • find file with pattern in cscope: cs f fe file //check cscope_maps.vim for other switches
  • to list all possibilities: type :ts IPPR and then press <Ctrl-d>
  • <Ctrl-]> jumps to definition while <Ctrl-o> returns to previous location (double <'> does the same too). <Ctrl-i> jumps forward.

Some handy .vimrc entries to enjoy your mug of vi:

set ai          " can scatter code when pasting in a remote terminal (like PuTTY) on Windows;
                " run :set noai first to avoid that
set cindent     " C-like indentation
set ic          " ignore case while searching
set incsearch   " incremental search during typing
set hlsearch    " highlight search matches
set ts=4        " tab length
set sw=4        " shift width, the amount of indentation used for new lines
set nu          " show line numbers

Happy coding! 🙂