Dos2Unix / Unix2Dos utility in Linux


This article gives an overview of the "Dos2Unix" package in Linux and how it works. It also describes how DOS and UNIX text files differ in format, and how to convert between the two when required.

What is Dos2Unix / Unix2Dos utility in Linux?


Dos2Unix / Unix2Dos is a text file format converter between DOS / Mac and UNIX, in either direction. Text files on Windows differ in format from those on UNIX, so they have to be converted to suit the OS we work on.

The Dos2Unix package includes the utilities "dos2unix" and "unix2dos", which convert plain text files in DOS or Mac format to UNIX format and vice versa.

In DOS / Windows, each line of a text file ends with a line break, also known as a newline, made up of two characters: a Carriage Return (CR) followed by a Line Feed (LF), that is "\r\n".

CR = \r = Carriage Return
LF = \n = Line Feed


Text files in UNIX, however, use a single character as the line break: the Line Feed (LF), that is "\n".
In Mac text files prior to Mac OS X, the line break was a single Carriage Return (CR) character, that is "\r". Mac OS X and later use the same LF line breaks as UNIX.
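
The three conventions described above can be reproduced with 'printf' alone, without any converter installed. The following is a minimal sketch; the temporary directory and the file names 'dos.txt', 'unix.txt' and 'mac.txt' are placeholders chosen for illustration.

```shell
# Work in a throwaway directory so no existing file is overwritten.
tmpdir=$(mktemp -d)

# Write the same two lines ("ar" and "un") with each line-break convention.
printf 'ar\r\nun\r\n' > "$tmpdir/dos.txt"    # DOS / Windows: CR LF
printf 'ar\nun\n'     > "$tmpdir/unix.txt"   # UNIX / Linux / Mac OS X: LF
printf 'ar\run\r'     > "$tmpdir/mac.txt"    # Mac before OS X: CR

# Byte counts (tr strips the padding that some 'wc' versions add).
dos_size=$(wc -c < "$tmpdir/dos.txt" | tr -d ' ')
unix_size=$(wc -c < "$tmpdir/unix.txt" | tr -d ' ')
mac_size=$(wc -c < "$tmpdir/mac.txt" | tr -d ' ')
echo "dos=$dos_size unix=$unix_size mac=$mac_size"   # prints: dos=8 unix=6 mac=6

# Dump each file character by character to see the line breaks themselves.
od -c "$tmpdir/dos.txt"
od -c "$tmpdir/unix.txt"
od -c "$tmpdir/mac.txt"
```

The DOS file is 2 bytes larger than the other two because each of its two lines carries one extra CR.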

When using these utilities, binary files are skipped automatically unless the conversion is forced. Non-regular files, such as directories and FIFOs, are also skipped automatically. Symbolic links and their targets are kept untouched by default.

Example with explanation:

Here, we have a file 'samp' in DOS format, which is converted to UNIX format. The changes are shown step by step using various commands.

[nura@nura-PC ~$] ls -lrt samp --> file samp in Dos format
-rw-r--r-- 1 nura None 8 May 14 11:42 samp

[nura@nura-PC ~$] cat samp --> plain 'cat' displays only the printable characters
ar
un


[nura@nura-PC ~$] cat -v samp --> adding '-v' shows the non-printing character '^M', i.e., the Carriage Return (CR) or '\r'
ar^M
un^M


[nura@nura-PC ~$] dos2unix samp --> converting the file to unix format
dos2unix: converting file samp to Unix format ...

[nura@nura-PC ~$] ls -lrt samp --> 'samp' is now 2 bytes smaller, since the 2 '\r' characters were removed during the conversion
-rw-r--r-- 1 nura None 6 May 14 11:46 samp

[nura@nura-PC ~$] cat -v samp --> the '^M', i.e., '\r', no longer appears at the end of each line
ar
un


'od' stands for octal dump and shows the bytes of a file in octal by default. Since we have given '-c', it displays them as ASCII characters instead.

[nura@nura-PC ~$] od -c samp --> since the file is in unix format, only '\n' is displayed at the end of each line
0000000 a r \n u n \n
0000006


[nura@nura-PC ~$] unix2dos samp --> convert again to dos
unix2dos: converting file samp to DOS format ...

[nura@nura-PC ~$] od -c samp --> now the '^M' or '\r' is displayed
0000000 a r \r \n u n \r \n
0000010


[nura@nura-PC ~$] cat -v samp --> alternatively shown below
ar^M
un^M
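
Instead of inspecting 'cat -v' or 'od -c' output by eye, the format can also be checked by counting the lines that end in CR. The following is a minimal sketch using only 'grep' and 'printf'; the function name 'count_cr_lines' and the file names are hypothetical and chosen for illustration.

```shell
# Count the lines of a file that end in CR (hypothetical helper name).
count_cr_lines() {
    cr=$(printf '\r')                  # a literal CR; $(...) strips only trailing LFs
    grep -c "${cr}\$" "$1" || true     # grep exits non-zero on 0 matches but still prints 0
}

tmpdir=$(mktemp -d)
printf 'ar\r\nun\r\n' > "$tmpdir/samp_dos"    # same content as 'samp' in DOS format
printf 'ar\nun\n'     > "$tmpdir/samp_unix"   # same content as 'samp' in UNIX format

dos_hits=$(count_cr_lines "$tmpdir/samp_dos")
unix_hits=$(count_cr_lines "$tmpdir/samp_unix")
echo "DOS file:  $dos_hits line(s) end in CR"    # prints: DOS file:  2 line(s) end in CR
echo "UNIX file: $unix_hits line(s) end in CR"   # prints: UNIX file: 0 line(s) end in CR
```

A non-zero count suggests that the file is in DOS format and is a candidate for 'dos2unix'.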


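[nura@nura-PC ~$] tr -d '\r' < samp > samp1 --> when the Dos2Unix utility is not available, the '\r' can be removed using the 'tr' command; 'tr' reads only standard input, so the file is redirected in and the result is written to 'samp1'

The resulting 'samp1' is identical to the 'samp' file when it is in UNIX format.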
[nura@nura-PC ~$] cat -v samp1
ar
un

[nura@nura-PC ~$] od -c samp1
0000000 a r \n u n \n
0000006

[nura@nura-PC ~$] ls -lrt samp1
-rw-r--r-- 1 nura None 6 May 14 11:47 samp1
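
The 'tr' fallback shown above covers only the DOS to UNIX direction. The following sketch adds the reverse direction with 'awk' and the Mac case with 'tr'. It assumes a standard POSIX 'tr' and 'awk'; the temporary directory and the names 'samp2', 'samp3' and 'samp_mac' are placeholders chosen for illustration.

```shell
tmpdir=$(mktemp -d)
printf 'ar\r\nun\r\n' > "$tmpdir/samp"    # DOS-format input, 8 bytes

# DOS -> UNIX: delete the CR characters.
# Note that this deletes every CR, not only those directly before an LF.
tr -d '\r' < "$tmpdir/samp" > "$tmpdir/samp1"

# UNIX -> DOS: re-emit each line terminated by CR LF.
awk '{ printf "%s\r\n", $0 }' "$tmpdir/samp1" > "$tmpdir/samp2"

# Mac (CR only) -> UNIX: translate each CR into an LF.
printf 'ar\run\r' > "$tmpdir/samp_mac"
tr '\r' '\n' < "$tmpdir/samp_mac" > "$tmpdir/samp3"

od -c "$tmpdir/samp1"
od -c "$tmpdir/samp2"

# The DOS -> UNIX -> DOS round trip should give back the original bytes.
cmp "$tmpdir/samp" "$tmpdir/samp2" && echo "round trip restored the original bytes"
```

Unlike 'dos2unix', these pipelines do not skip binary files, so they are best used only on files known to be plain text.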



Please refer to the man page to understand all the features of the "Dos2Unix / Unix2Dos" utilities. They were modeled after dos2unix under SunOS / Solaris and have similar conversion modes. To run in Mac mode, we can use the command-line option "-c mac", or alternatively use the commands "mac2unix" or "unix2mac", which work similarly to the above.

Man page excerpt for the conversion mode option:
-c, --convmode CONVMODE
Set conversion mode. Where CONVMODE is one of: ASCII, 7Bit, ISO, MAC with ASCII being the default.
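
The Mac mode described above can be tried on a small test file. The following is a minimal sketch, assuming a dos2unix build that supports the '-c' (conversion mode) and '-n' (write to a new file) options; the file names 'samp_mac' and 'samp_unix' are placeholders chosen for illustration.

```shell
tmpdir=$(mktemp -d)
printf 'ar\run\r' > "$tmpdir/samp_mac"    # Mac format before OS X: CR line breaks

if command -v dos2unix >/dev/null 2>&1; then
    # -c mac selects Mac mode, which converts CR line breaks to LF.
    # -n writes the result to a new file and leaves the input untouched.
    dos2unix -c mac -n "$tmpdir/samp_mac" "$tmpdir/samp_unix"
    od -c "$tmpdir/samp_unix"
else
    echo "dos2unix is not installed; skipping the Mac-mode conversion"
fi
```

Where the package is installed, running 'mac2unix' with the same '-n' arguments should give the same result as 'dos2unix -c mac'.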

