Ask Ubuntu


convert txt file to csv separated with tabs


Sorry people, I’m new to Linux, and while I looked through the list of answered questions, I don’t know enough to recognize whether my question was answered, or whether I can adapt one of the answers to my particular little problem.

I get a text file of data from my boss, who learned to use computers one way, and he won’t change. The data is almost a csv file, except the fields are all separated by space characters rather than a comma or tab character. And the text fields of data include embedded spaces as well.

Each field is either a number or is numbers and text, all fields are of varying lengths, and none are off-set with single or double quotes. The number fields predominate, and no text field is adjacent to any other text field. Rarely is an embedded number in a text field preceded or followed by a [space] character.

Unfortunately not every [space] character can just be replaced. Instead, because field breaks generally come in the form of either [space]7 or 6[space], this is how I determine whether a [space] character should be converted to a [tab] character: if the [space] character is beside a digit, it’s to be converted to a [tab] character.

So using the Find/Replace function in Notepad for Windows, I search for a digit-space or a space-digit combination, converting that [space] character to a [tab] character. I have to do this ten times for the digit-space form (e.g. 2[space]) and then ten more times for the space-digit form (e.g. [space]7), once per digit.
I’m looking for a script to do this automatically.

Here is an example of the file I get. It contains four fields separated by [space] characters (first line). Each following line is one record, so the second line is the first record. Account is 2281, Units are 19, Description is Toshiba PX-1982GRSUB, and finally the Delta field contains the 0:

Account Units Description Delta
2281 19 Toshiba PX-1982GRSUB 0
9618 200 HP MX19942-228b -25
19246 4 CompuCom HD300g Hard Drive 4

So what I’m looking for is a script that will read the original file, convert the [space] characters that are field separators into [tab] characters and write it all to a new file. And I want the explanation, so I don’t keep asking the same questions over and over again.

command-line bash text-processing csv

edited Jan 25 ’16 at 12:14

David Foerster


asked Jan 21 ’16 at 3:17

Bobby H.


  • 4

    Can you post a sample input and a sample output? I mean I think it’s clear enough, but just to be sure. Also please clarify whether fields containing text could also contain digits (although I’d assume they couldn’t).
    –  kos
    Jan 21 ’16 at 3:24


  • Are any text fields with spaces formatted like "this with quotes or similar?"
    –  Wilf
    Jan 21 ’16 at 3:51


  • And the "fake tabs" are they always the same number of spaces? In short; we need an example, like @kos requested.
    –  Jacob Vlijm
    Jan 21 ’16 at 6:48

  • Kos: Yep, I see what you mean; yes, ’embedded numbers’ are present in otherwise nominal text fields. Rarely the number is led or followed by a <space> rather than some other character. When that happens I either add an unused character like @$%^&*, so it gets skipped when I use "Replace All", or I manually edit that section prior to importing into the database or Excel sheet. Wilf: No, the text fields are not separated by single or double quotes. And the text fields can begin and end in a wide variety of ways. The number of different text fields is finite but very, very large.
    –  Bobby H.
    Jan 21 ’16 at 19:37

  • 1

    Could you please add that information and the various examples to your question by editing it, so that users will be able to read them directly when reading the question? Thanks.
    –  kos
    Jan 22 ’16 at 18:15


2 Answers
accepted

A web search for “replace space with comma” was very fruitful; didn’t that work out for you first? You would’ve found lots of answers like this:

tr ' ' ',' < input > output

or for tabs:

tr '\t' ',' < input > output

and

sed 's/\s\+/,/g' input > output

\s is the whitespace class (like [[:space:]]) and \+ (the escaped +) means one or more of the preceding item, so in GNU sed this replaces each run of spaces or tabs with a single comma. sed works line by line, so the newline at the end of each line is left alone. This next one would only replace each single space or tab with a single comma (like running both of the tr commands above):

sed 's/[ \t]/,/g' input > output
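To make the difference between the two sed commands above concrete, here is a small sketch run on a made-up line that contains a run of two spaces. It uses [[:space:]] as a portable spelling of the same two patterns; the variable names and the sample line are only for illustration.

```shell
#!/bin/sh
# Sample line (an assumption for this sketch) with two spaces between "a" and "b".
line='a  b c'

# One or more whitespace characters collapse into a single comma
# (portable spelling of the \s\+ pattern).
runs=$(printf '%s\n' "$line" | sed 's/[[:space:]][[:space:]]*/,/g')

# Each whitespace character becomes its own comma, leaving an empty field.
each=$(printf '%s\n' "$line" | sed 's/[[:space:]]/,/g')

printf 'runs: %s\neach: %s\n' "$runs" "$each"
```

The first form is usually what you want when the separators are padded with extra spaces; the second keeps a one-to-one mapping, which matters if an empty field is meaningful.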

And -i edits the file in-place (directly edits the file) in sed
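Since -i overwrites the input, it may be worth keeping a backup while experimenting. The sketch below assumes a sed that accepts a suffix attached to -i (GNU and BSD sed both do); the temporary directory and the data.txt file name are placeholders.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)
printf '%s\n' 'a b c' > "$workdir/data.txt"

# -i.bak rewrites data.txt in place and keeps the original as data.txt.bak.
sed -i.bak 's/ /,/g' "$workdir/data.txt"

edited=$(cat "$workdir/data.txt")
backup=$(cat "$workdir/data.txt.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"

# Clean up the throwaway directory.
rm -r "$workdir"
```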

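Here’s a sed that will match a space-digit or a digit-space pair (here the literal pairs space-4 and 1-space) and replace the pair with a comma, using the OR operator | escaped as \| (a GNU sed extension). Note that the matched digit is replaced along with the space: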

sed 's/ 4\|1 /,/g'
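A sketch of how the same idea could cover all ten digits and keep the digit, which is what the question asks for: capture the digit with \( \) and put it back with \1, writing a tab instead of a comma. The function name spaces_to_tabs and the file names in the usage comment are hypothetical; the tab is taken from printf so the script does not depend on sed understanding \t.

```shell
#!/bin/sh
# Sketch: turn every space that touches a digit into a tab, keeping the digit.
# spaces_to_tabs is a hypothetical helper name, not something from the answer.
spaces_to_tabs() {
    tab=$(printf '\t')   # a literal tab character
    # 1st expression: space followed by a digit -> tab + the same digit
    # 2nd expression: digit followed by a space -> the same digit + tab
    sed -e "s/ \([0-9]\)/${tab}\1/g" -e "s/\([0-9]\) /\1${tab}/g"
}

# Typical use (placeholder file names):
#   spaces_to_tabs < in.txt > out.tsv
printf '%s\n' '19246 4 CompuCom HD300g Hard Drive 4' | spaces_to_tabs
```

This follows the digit-adjacency rule literally, so it inherits that rule's limits: the header line contains no digits and stays space-separated, and the space before a negative value such as -25 is left alone because it sits next to a minus sign rather than a digit.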


edited May 23 ’17 at 12:39

Community


answered Jan 23 ’16 at 0:56

Xen2050


  • Xen2050: Thank you. Unfortunately ‘tr’ didn’t work. I want to search for a token 2 characters in length, and as far as I can tell ‘tr’ only handles 1 character at a time. Still looking at ‘sed’. And yes, I did look (and am still looking), but I haven’t yet found anything that does what I want.
    –  Bobby H.
    Jan 23 ’16 at 1:47

  • Xen2050: Thank you, again. ‘sed’ worked. The way I did it, it’s clunky, but even clunky it will save me about an hour every time I have to handle the text files the boss gives me. And I have a point from which to advance now. Again thank you. And thank you all for the help.
    –  Bobby H.
    Jan 23 ’16 at 2:37

  • No problem. That is a shortcoming of tr: it can’t match multiple characters. sed works well with just about any string; I’m guessing something like sed 's/xy/,/g' is what you’re using, maybe sed 's/ 3/,/g' and sed 's/4 /,/g'. It even does regexes, which can get pretty complicated and useful too. I’m guessing one of them would match space-num or num-space… Good luck!
    –  Xen2050
    Jan 23 ’16 at 7:54


  • Actually, thinking about matching a number-space or a space-number and replacing it with a comma, this might work in one go too: sed 's/[ 0-9][ 0-9]/,/g' -> no, that one also matches two spaces and two numbers… close though. Found one that DOES match, with the OR operator |, escaped as \|
    –  Xen2050
    Jan 23 ’16 at 22:58


  • Yep, basically. But I am searching for a number-space/space-number combination and then replacing it with number-tab/tab-number, where the number is one of 0-9 and the replaced number matches the searched-for number. Now I’m searching through the ‘man’ pages to see how I can do it more elegantly.
    –  Bobby H.
    Jan 24 ’16 at 23:23




Ok, so you need to replace the first two and the last space in every line with a comma. You can’t just replace every space, because the 3rd field may contain spaces itself. You can do this with regular expression replacement. Here’s a sed command that works (GNU sed, because of -r, \S and \s):

sed -re 's/^(\S*) (\S*) (.*) (\S+)\s*$/\1,\2,\3,\4/' in.txt > out.csv

With the above example this returns:

Account,Units,Description,Delta
2281,19,Toshiba PX-1982GRSUB,0
9618,200,HP MX19942-228b,-25
19246,4,CompuCom HD300g Hard Drive,4

This is still quite fragile when handling empty fields, and it breaks entirely if columns other than the 3rd contain spaces. It’s very easy to introduce such malformed data if it is formatted manually, as done by your boss. You should suggest to him to switch to a more robust table format (e. g. proper CSV & Co.) and editor (common spreadsheet tools can manipulate CSV quite well and flexibly, e. g. LibreOffice/OpenOffice Calc, Microsoft Excel and Google Docs).
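One way to make the fragility visible is to have the conversion complain about lines that don't fit the expected shape. The sketch below swaps sed for awk (not the command from this answer) and assumes the same four-column layout: the first two and the last space-separated tokens are columns, and everything in between is the description. The name to_csv4 and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# to_csv4: hypothetical helper for the four-column layout described above.
# Lines with fewer than 4 tokens are reported on stderr and skipped,
# and the function then returns a non-zero exit status.
to_csv4() {
    awk '
    NF < 4 {
        printf "line %d: expected at least 4 tokens, got %d\n", NR, NF > "/dev/stderr"
        bad = 1
        next
    }
    {
        # Rebuild the description from tokens 3 .. NF-1.
        desc = $3
        for (i = 4; i < NF; i++) desc = desc " " $i
        print $1 "," $2 "," desc "," $NF
    }
    END { exit (bad ? 1 : 0) }'
}

# Typical use (placeholder file names):
#   to_csv4 < in.txt > out.csv
printf '%s\n' '19246 4 CompuCom HD300g Hard Drive 4' | to_csv4
```

Because awk splits on runs of whitespace, repeated spaces inside a description are collapsed to one, and a description that itself contains a comma would still need quoting to be valid CSV.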


edited Jan 25 ’16 at 23:18

answered Jan 25 ’16 at 12:26

David Foerster


  • Wow! Really nice. Thank you. That is much better than what I did, and I have a bunch of things to look up and figure out.
    –  Bobby H.
    Jan 25 ’16 at 21:52

  • As you’re a reputation 3 user: if you prefer this answer, you can select it as the accepted answer instead of the currently accepted answer.
    –  David Foerster
    Jan 25 ’16 at 23:13

  • David: Sorry, it took me a while to figure out what your command did. It works beautifully for a four-field text file. Unfortunately I get text files with a variable number of fields, so I would have to look each one over to determine the number of fields and either modify the script or send it a switch to set the field count. The solution I chose would work regardless of the number of fields. But your command is much prettier than my code, and compact also. Still thinking about how I could implement your code in my solution though. Thanks again to you and all the rest. B
    –  Bobby H.
    Feb 2 ’16 at 1:46

  • How are the additional fields structured? Can you extend your question with a description of the new requirements? Or better yet, ask a follow-up question referring to this one. It might also be better to ask this on Stack Overflow, where you would get a much broader audience for your platform-independent text processing problem, since you don’t care about the programming language (shell script, sed, Awk, Perl, Python…).
    –  David Foerster
    Feb 2 ’16 at 7:26
