Weird issue with reading text file in Java

Discussion in 'OT Technology' started by beez, Nov 4, 2004.

  1. beez

    beez New Member

    Joined:
    Jun 3, 2004
    Messages:
    19,143
    Likes Received:
    0
    Location:
    Queens
    I'm currently writing a little program in Java for work to remove HTML-style tags from text files used in a corpus. Basically, the program uses the readLine() method to put each line of a file into a String array. The strings are then processed to remove the tags and blah blah blah. I'm having a problem with reading in the original files. Right now this is the loop I use to go through each file:

    Code:
    /** count variable */
    int count = 0;

    /** string array */
    String inputArr[] = new String[100000];

    /** go through the file */
    while (fFile.readLine() != null) {
        inputArr[count] = fFile.readLine();
        count++;
    }
    
    The problem is that, every time, each file seems to be only about halfway read when the loop ends. Everything else works fine; it's just that only half of each file ends up in the array. The files are all between 32 and 34 KB in size. I know the array isn't filling up, because there are nowhere near 100,000 lines in there, and if I dump the whole array right after the while loop exits I still get only about half of the text.

    I've gone through the texts and I don't see any blank lines where there shouldn't be any. I'm really perplexed with this so any insight would help. Thanks!
     
  2. Penguin Man

    Penguin Man Protect Your Digital Liberties

    Joined:
    Apr 27, 2002
    Messages:
    21,696
    Likes Received:
    0
    Location:
    Edmonton, AB
    I have no idea about your code (I've never manipulated files in Java), but is it absolutely necessary to do this in Java? TCL can do it in a few lines :dunno:
     
  3. sam758

    sam758 OT Supporter

    Joined:
    Aug 26, 2003
    Messages:
    901
    Likes Received:
    0
    Assuming fFile is a BufferedReader, you can try this instead:

    while (fFile.ready())
        inputArr[count++] = fFile.readLine();

    Also, using a fixed-size array of strings is a bad idea. A better way would be to use an ArrayList, or just use one String to store the whole file.
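
    For what it's worth, the loop in the first post calls readLine() twice per pass: once in the while condition (that line is thrown away) and once in the body. Only every other line lands in the array, which would explain getting about half of each file. Below is a minimal sketch of the usual idiom, which reads each line exactly once and tests it for null. The class name ReadAllLines, the readLines helper, and the StringReader standing in for the real file are assumptions for illustration, not part of the original program.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class ReadAllLines {

    /** Reads every remaining line, calling readLine() exactly once per iteration. */
    static List<String> readLines(BufferedReader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        // Assign and test in one step so no line is consumed and then discarded.
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the real corpus file (assumption).
        String sample = "<p>one</p>\ntwo\nthree\nfour\n";
        try (BufferedReader fFile = new BufferedReader(new StringReader(sample))) {
            List<String> lines = readLines(fFile);
            System.out.println(lines.size() + " lines read");
        }
    }
}
```

    The ArrayList grows as needed, so there is no need to guess a size like 100,000 up front. As far as I know, ready() only reports whether the next read would block, so the null check is probably the more dependable end-of-file test.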
     
  4. beez

    beez New Member

    Yeah, it is a BufferedReader. I forgot to put the File initialization bit in there and only realized it as I was going to bed. I was looking through the API and missed the ready() method. I'll try that. The other thing I was thinking of doing was just using the read() method to go through the file character by character, but I'd already taken the time to write the nested loops to go through the string array so I didn't want to bother :hsugh:.

    I'll try that with my current code and if it doesn't work I'll just use the read() method and work with the bigass string.
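
    A rough sketch of that fallback, assuming the whole file fits comfortably in memory: read() returns -1 at end of input, so the loop appends characters until then. The class name ReadWholeText, the readAll and stripTags helpers, and the regex are all hypothetical. The regex is only a naive stand-in for the real tag-removal logic and would not handle a '>' inside an attribute value.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical class name, used only for this sketch.
class ReadWholeText {

    /** Reads the reader character by character into a single String. */
    static String readAll(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        int c;
        // read() returns the next character as an int, or -1 at end of input.
        while ((c = reader.read()) != -1) {
            text.append((char) c);
        }
        return text.toString();
    }

    /** Naive tag removal (assumption): drops anything between '<' and the next '>'. */
    static String stripTags(String text) {
        return text.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the real corpus file (assumption).
        String sample = "<p>Hello</p>\n<b>world</b>";
        try (BufferedReader fFile = new BufferedReader(new StringReader(sample))) {
            System.out.println(stripTags(readAll(fFile)));
        }
    }
}
```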

    Thanks!!
     
  5. beez

    beez New Member

    I don't know anything about TCL other than eggdrop bots are written in it :(
     
