Newcomers DBPro Corner / Detecting Carriage Returns?

Libervurto
13
Years of Service
User Offline
Joined: 30th Jun 2006
Location: On Toast
Posted: 14th Feb 2013 01:59 Edited at: 14th Feb 2013 02:00
I just downloaded Homer's Iliad, as a text file, which is pretty darn long and annoyingly it's littered with carriage returns. So naturally I thought: "this is a job for Dirk Basic!" Problem is my program doesn't seem to be detecting the carriage returns. What is the proper method for doing so? Please tell me this is possible

^ That's what she said.
MrValentine
AGK Backer
9
Years of Service
User Offline
Joined: 5th Dec 2010
Playing: FFVII
Posted: 14th Feb 2013 02:39
Hazarding a rough guess, but is this related to the character encoding? ASCII UTF-8 etc.? or the line ending char?

A quick Bing search for "detecting carriage returns in a text file" brought up http://www.autohotkey.com/board/topic/417-detecting-linefeeds-as-opposed-to-carriage-returns-in-a-file/

Hope that helps...

[ The Bing Search results link ]

Libervurto
Posted: 14th Feb 2013 04:02
Maybe I'm being dumb but I can't see anything useful in the link you posted.
Can I not read a carriage return/line feed as an ASCII character? Do I have to read it as bytes or some other method?

pcRaider
12
Years of Service
User Offline
Joined: 30th May 2007
Location:
Posted: 14th Feb 2013 04:39 Edited at: 14th Feb 2013 04:41
How about this?

Attachments
MrValentine
Posted: 14th Feb 2013 07:35
OBese87 - look for a line ending char... It is hidden by default... Use some application that will show it to you...

Phaelax
DBPro Master
16
Years of Service
User Offline
Joined: 16th Apr 2003
Location: Metropia
Posted: 14th Feb 2013 08:10
Read it at the byte level, not as strings.

Something similar to this:
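A minimal sketch of the byte-level idea, written in Python rather than DBP, and not the code originally posted here. The helper name `count_control_bytes` is made up for illustration:

```python
def count_control_bytes(data: bytes) -> dict:
    """Count carriage returns (byte 13) and line feeds (byte 10) in raw bytes."""
    counts = {"CR": 0, "LF": 0}
    for b in data:  # iterating over bytes yields integers 0..255
        if b == 13:
            counts["CR"] += 1
        elif b == 10:
            counts["LF"] += 1
    return counts


# Sample data: one Windows-style break (CR LF) and one Unix-style break (LF).
sample = b"first line\r\nsecond line\n"
print(count_control_bytes(sample))
```

For a real file, the bytes would come from something like `open(path, "rb").read()`. The point is the same as in DBP: compare raw byte values instead of relying on string reads, which tend to swallow the line endings.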


"You're all wrong. You're all idiots." ~Fluffy Rabbit
TheComet
12
Years of Service
User Offline
Joined: 18th Oct 2007
Location: I`m under ur bridge eating ur goatz.
Posted: 14th Feb 2013 10:24
Just to back up what Phaelax is saying:

1) It will only work if the file is in an ASCII-compatible encoding such as ANSI or UTF-8. (The two are not the same thing, but both store CR and LF as the single bytes 13 and 10. A UTF-16 file would need different handling.)
2) The ASCII values for a new line are "carriage return" (13) followed by "line feed" (10), as can be seen in the table below. In DBP code that would be text$ + chr$(13) + chr$(10). Note that the two-byte pair is a Windows convention. Linux and modern Mac systems use only "line feed" (10) without the carriage return. That's why text files from a Unix system often appear to have no line breaks when opened in a basic Windows text editor.
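To see which convention a given file actually uses, something along these lines might help. It is a rough Python sketch (not DBP), and `detect_line_ending` is a hypothetical helper name:

```python
def detect_line_ending(data: bytes) -> str:
    """Classify raw bytes as 'CRLF', 'LF', 'CR', 'mixed' or 'none'."""
    crlf = data.count(b"\r\n")            # Windows-style pairs (13 then 10)
    lone_cr = data.count(b"\r") - crlf    # CRs that are not part of a pair
    lone_lf = data.count(b"\n") - crlf    # LFs that are not part of a pair
    found = [name for name, n in (("CRLF", crlf), ("LF", lone_lf), ("CR", lone_cr)) if n]
    if not found:
        return "none"
    if len(found) > 1:
        return "mixed"
    return found[0]


print(detect_line_ending(b"one\r\ntwo\r\n"))
print(detect_line_ending(b"one\ntwo\n"))
```

If this reports "LF" for the Iliad file, there are no carriage returns to detect at all, only line feeds.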



TheComet

http://blankflankstudios.tumblr.com/
"ZIP files are such a retarded format!" - Phaelax
nonZero
8
Years of Service
User Offline
Joined: 10th Jul 2011
Location: Dark Empire HQ, Otherworld, Silent Hill
Posted: 14th Feb 2013 14:00
Pretty much as Phaelax and TheComet explained. But what are you actually trying to do, Obese? I mean if you just want to remove the carriage-returns then use Phaelax's method (although if it's a really big file you may wanna use memblocks) and change this line,
, to
.
However if you remove all newlines then that'll kill the format. Also, there may not be as many CR's as you think because different readers wrap things differently so it also comes down to that. Then again, if the format's already such a mess, it prolly won't matter, lol. Why not get it in .mobi or .epub format?

"Quotes in signatures are just stupid, especially if you're quoting yourself" ~ me
TheComet
Posted: 14th Feb 2013 16:37
Quote: "Why not get it in .mobi or .epub format?"


I've not looked into it, but are those formats easy to read/use? I know I use them for my e-reader, but that's about all I know.

TheComet

http://blankflankstudios.tumblr.com/
"You're all wrong. You're all idiots." - Fluffy Rabbit
"Bottom line, people are retarded." - Fluffy Rabbit
nonZero
Posted: 14th Feb 2013 17:58 Edited at: 14th Feb 2013 17:59
Quote: "I've not looked into it, but are those formats easy to read/use?"

Well, the .mobi format is, because there's Kindle Viewer for Mac and Windows:
http://www.amazon.com/gp/feature.html?ie=UTF8&docId=1000765261
It's cool coz you can emulate other devices!

Mobi is supported by the default app, "E-Book", on Android tabs (but the Moon+ Reader app, despite its few bugs, gives you a better experience as you can change the background image, colours, etc., which is ideal for night reading).
iOS also supports a .mobi app - prolly more than one, lol.
I don't know about Linux support, but I'm sure at least one decent Linux reader exists; I've just never looked into it. I also only use my "Reader" -- my tablet -- to read as I find my PC a little uncomfortable to hunch over (especially since my change to a laptop). My Kindle Previewer is really just for testing my books' appearance and formatting (stop testing appearance, finish editing and at least publish one!!!)

<rambling on-and-on>

So it's a fairly accessible format. The KindleGen app converts HTML/CSS into .mobi (although there are many tags and elements not supported), but it's still fairly easy to compile your books if you ever want to publish anything. Mobi was definitely more supported when I last took a serious look into it.

I haven't looked into epub outside of my tablet, but on my tablet it seems a more compatible format as far as the e-reader apps I tried go, so my epub books often read better and seem to contain practically no format problems. I do blame the software more than either format though, because .mobi's always look great on the Kindle Previewer, even when they get format-messed on my tablet. Perhaps there's a certain amount of responsibility on the people that compile them. Many people think KindleGen is a magic wand, but one has to read the format specifications thoroughly and follow them correctly.

</rambling on-and-on>

Libervurto
Posted: 14th Feb 2013 21:07 Edited at: 14th Feb 2013 21:08
It worked!
This is what I did:

I think maybe it doesn't have any carriage returns, only line feeds, because this works fine; or maybe they are still in there somewhere I dunno.

Why do my tabs go huge when I post code? Is it my browser?

nonZero
Posted: 14th Feb 2013 22:21 Edited at: 14th Feb 2013 22:23
Cool but you're only removing LFs. Also, those IFs will slow you down. Try changing this:

to this

as you want to remove 13 and 10 while minimising the number of nests per cycle. Ideally, you could optimize further but it'd require some GOTO magicks. Also, best practice is to declare 'b' AS BYTE before using it, as it ensures DBP stores it correctly. DBP does funky things with variables and memory. But if it works that's fine.
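For illustration only, the "one combined test per byte" idea looks roughly like this in Python (not DBP, and not the snippets originally posted above). `strip_line_breaks` is a made-up name:

```python
def strip_line_breaks(data: bytes) -> bytes:
    """Drop every CR (13) and LF (10), using a single combined test per byte."""
    out = bytearray()
    for b in data:
        # One condition instead of two nested IFs: keep the byte only
        # if it is neither a carriage return nor a line feed.
        if b != 13 and b != 10:
            out.append(b)
    return bytes(out)


print(strip_line_breaks(b"rage\r\nof\nAchilles"))
```

Note that this removes every line break, paragraph breaks included.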

As far as checking your file, use a hex editor to view the output. My personal recommendation is the one in this PC suite:
http://arainia.com/software/gizmo/ as it handles big files fast and the suite is feature packed (Virtual HDs, Custom Menu, Code Editor, Full Shell Integration, Virtual DVD, etc) - great set of tools. In the hex editor, just look for 10 or 13, which will appear as 0A and 0D.
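If no hex editor is handy, a few lines of Python give a similar view. This is just a sketch: it assumes Python 3.8 or later (for the separator argument to `bytes.hex`), and `hex_view` is a hypothetical helper name:

```python
def hex_view(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return data.hex(" ")


# "H", "i", then carriage return (0d) and line feed (0a).
print(hex_view(b"Hi\r\n"))  # 48 69 0d 0a
```

Python prints the digits in lower case, so look for 0a and 0d in the output.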

Hope that's of use

Libervurto
Posted: 14th Feb 2013 22:37
I need to keep the line feeds that separate paragraphs though.

nonZero
Posted: 15th Feb 2013 20:15
Ah, now I see what you were doing: you're detecting single newlines and culling them but maintaining double newlines. Nesting 1 level is prolly the easiest way to do it then.
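A rough Python sketch of that "cull single newlines, keep double ones" approach, for anyone following along. It is not the DBP code from earlier in the thread; `unwrap_paragraphs` is a made-up name, and replacing each culled newline with a space (so the words on either side don't run together) is an assumption:

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines but keep blank-line paragraph breaks."""
    # Normalise Windows (CR LF) and lone CR endings to plain LF first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A newline with no newline directly before or after it is a wrap
    # inside a paragraph, so swap it for a space. Runs of two or more
    # newlines (paragraph breaks) are left untouched.
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)


wrapped = "Sing, O goddess,\nthe anger of Achilles\n\nMany a brave soul\ndid it send"
print(unwrap_paragraphs(wrapped))
```

A single trailing newline at the very end of the file also becomes a space with this pattern, which may or may not matter for a book-length text.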

