Sunday, March 29, 2009

How do I convert between Unix and Windows text files?


The format of Windows and Unix text files differs slightly. In Windows, each line ends with a carriage return followed by a line feed (the ASCII characters CR and LF), but Unix uses only a line feed. As a consequence, some Windows applications will not show the line breaks in Unix-format files. Likewise, Unix programs may display the carriage returns in Windows text files as Ctrl-m ( ^M ) characters at the end of each line.
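
To check which format a file is in before converting it, one option is to count its carriage returns. The following is a minimal sketch, assuming a POSIX shell with tr and wc; the helper name count_cr and the sample file names are made up for illustration.

```shell
# count_cr is a hypothetical helper, not a standard utility.
# It prints how many carriage-return bytes a file contains.
count_cr() {
    # tr -cd '\r' deletes every byte except CR; wc -c counts what remains.
    # The final tr strips the padding some wc implementations add.
    tr -cd '\r' < "$1" | wc -c | tr -d ' '
}

# Illustrative sample files: one in each format.
printf 'one\r\ntwo\r\n' > sample_win.txt
printf 'one\ntwo\n' > sample_unix.txt

count_cr sample_win.txt    # one CR per line in a Windows-format file
count_cr sample_unix.txt   # none in a Unix-format file

rm -f sample_win.txt sample_unix.txt
```

A count equal to the number of lines suggests Windows format; a count of zero suggests Unix format.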

There are many ways to solve this problem. This document provides instructions for using FTP, screen capture, unix2dos and dos2unix, tr, awk, Perl, and vi to do the conversion. Before you use these utilities, the files you are converting must first be on a Unix computer.

Note: In the instructions below, replace unixfile.txt with the name of the Unix file you are transferring, and replace winfile.txt with the name of the Windows file you are transferring.

FTP

When using an FTP program to move a text file between Unix and Windows, be sure the file is transferred in ASCII format. This will ensure that the document is transformed into a text format appropriate for the host. Some FTP programs, especially graphical applications like Hummingbird FTP, do this automatically. If you are using FTP from the command line, however, before you begin the file transfer, be sure to enter at the FTP prompt:

ascii

Note: You need to use a client that supports secure FTP to transfer files to and from Indiana University's central systems. For more, see At IU, what SSH/SFTP clients are supported and where can I get them?

Screen capture

You can also convert files from Unix to Windows format when transferring them to a PC with a communications program by selecting ASCII text download. Select this option with your communications program to capture all the text subsequently displayed to your screen, and then enter at the Unix prompt:

cat unixfile.txt

Most communications programs will add carriage returns to the stream of text as they save it to your computer's hard drive. Once the file has finished displaying, abort the text download.

Note: This method may be slow for large text files. Also, no error checking is performed on the file as it is transferred.

dos2unix and unix2dos

On systems using Solaris, the utilities dos2unix and unix2dos are available. These utilities provide a straightforward method for converting files from the Unix command line.

To use either command, simply type the command followed by the name of the file you wish to convert, and the name of a file which will contain the converted results. Thus, to convert a Windows file to a Unix file, at the Unix prompt, enter:

dos2unix winfile.txt unixfile.txt

To convert a Unix file to Windows, enter:

unix2dos unixfile.txt winfile.txt

Note: These utilities are available only on Solaris systems. To determine what variety of Unix is running on your computer, see In Unix, how can I display information about the operating system?

tr

You can use tr to remove all carriage returns and Ctrl-z ( ^Z ) characters from a Windows file by entering:

tr -d '\15\32' < winfile.txt > unixfile.txt

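You cannot use tr to convert a document from Unix format to Windows, because tr only translates or deletes individual characters and cannot insert a carriage return before each line feed.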
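
The tr approach can be tried on a small sample before you run it on a real file. This sketch assumes a POSIX shell; the function name win_to_unix_tr and the sample file names are illustrative only.

```shell
# win_to_unix_tr is a hypothetical wrapper around tr.
# Usage: win_to_unix_tr INPUT OUTPUT
win_to_unix_tr() {
    # Octal 15 is carriage return; octal 32 is Ctrl-z.
    tr -d '\15\32' < "$1" > "$2"
}

# Illustrative input: two CRLF-terminated lines followed by a Ctrl-z byte.
printf 'alpha\r\nbeta\r\n\032' > sample_win.txt

win_to_unix_tr sample_win.txt sample_unix.txt

# Compare against the expected Unix-format text; cmp -s is silent and
# reports the result only through its exit status.
printf 'alpha\nbeta\n' | cmp -s - sample_unix.txt && echo "converted cleanly"

rm -f sample_win.txt sample_unix.txt
```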

awk

To use awk to convert a Windows file to Unix, at the Unix prompt, enter:

awk '{ sub("\r$", ""); print }' winfile.txt > unixfile.txt

To convert a Unix file to Windows using awk, at the command line, enter:

awk 'sub("$", "\r")' unixfile.txt > winfile.txt

On some systems, the version of awk may be old and not include the function sub. If so, try the same command, but with gawk or nawk replacing awk.
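
Note that the Unix-to-Windows command adds a carriage return to every line, so running it on a file that is already in Windows format would leave two carriage returns per line. One way around this, sketched below as a variation rather than a replacement for the commands above, is to strip any existing carriage return before writing the line ending. The function name to_win_awk and the sample file names are hypothetical, and the sketch assumes an awk that supports sub.

```shell
# to_win_awk is a hypothetical helper. Usage: to_win_awk INPUT OUTPUT
to_win_awk() {
    # Remove a trailing CR if the line has one, then print the line
    # followed by CR LF, so already-converted lines are left unchanged.
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$1" > "$2"
}

# Illustrative input mixing both formats: the first line ends in CR LF,
# the second in LF only.
printf 'first\r\nsecond\n' > sample_mixed.txt

to_win_awk sample_mixed.txt sample_win.txt

# Every line should now end in exactly one CR LF.
printf 'first\r\nsecond\r\n' | cmp -s - sample_win.txt && echo "uniform CRLF"

rm -f sample_mixed.txt sample_win.txt
```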

Perl

To convert a Windows text file to a Unix text file using Perl, at the Unix shell prompt, enter:

perl -p -e 's/\r$//' < winfile.txt > unixfile.txt

To convert from a Unix text file to a Windows text file with Perl, at the Unix shell prompt, enter:

perl -p -e 's/\n/\r\n/' < unixfile.txt > winfile.txt

You must use single quotation marks in either command line. This prevents your shell from trying to evaluate anything inside. Perl is installed on all UITS shared central Unix systems.

vi

In vi, you can remove the carriage return ( ^M ) characters with the following command:

:1,$s/^M//g

Note: To input the ^M character, press Ctrl-v , then press Enter or return.

At Indiana University, to get support for personal or departmental Linux or Unix systems, see At IU, how do I get support for Linux or Unix?


This is document acux in domain all.
Last modified on August 22, 2008.
