Apply sledgehammer gently to CPU

So, I discovered that the import file of my old posts has a lot of mangled markup and incorrect hard line feeds in it. This may be due to:

  • Taking the file straight from Linux but uploading it from Windows to a Linux server.
  • The file already had mangled markup in it, and WordPress did its best to assemble things cleanly.

Now my options are:

  • Go through more than 500 pages and correct things by hand.
  • Clean up the original file back in Linux with Regexxer or in Windows with Funduc’s Search and Replace.
  • Delete the original entries and then re-import with a clean file.

The latter two sound much better to me. (Later: I went through the file with regular expressions and WordPress still mangled it. Argh! I guess I have to tediously edit it by hand.)
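For anyone attempting the same regular expression cleanup, here is a minimal sketch in Python of the kind of pass I mean. The function name `unwrap_hard_breaks` is made up for illustration, and it assumes that a blank line marks a real paragraph break while a single line break inside a paragraph is unwanted. It knows nothing about preformatted blocks, so it would flatten those too.

```python
import re

def unwrap_hard_breaks(text: str) -> str:
    """Normalize line endings and join hard-wrapped lines inside paragraphs.

    Hypothetical helper: assumes blank lines separate paragraphs.
    """
    # Convert Windows (CRLF) and stray CR line endings to plain LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Replace a single newline that sits between two non-newline
    # characters with a space; blank-line paragraph breaks are untouched.
    text = re.sub(r"(?<!\n)\n(?=[^\n])", " ", text)
    # Collapse runs of spaces or tabs left behind by the join.
    return re.sub(r"[ \t]{2,}", " ", text)
```

Running the import file through something like this before uploading should at least rule out the file itself as the source of the stray breaks.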

  1. Michael Hutchinson Sr says:

    Delete the original entries and then re-import with a clean file.

    • Pace Arko says:

      I tried that and it didn’t work. The problem isn’t the file. The problem has to do with the way WordPress brings the data in and converts it for use in MySQL. For some reason it inserts a bunch of hard line breaks when there should be none.

      As it is, I have to use another tool to quickly remove the line breaks in each entry (sigh) one by one.

      Or maybe, and this is riskier, I could find out which table MySQL keeps my entries in, export that table as a text file, clean it up with a regular-expression-based search and replace, and then import the corrected table back in.

      I’ll have to see what kind of front end Hostway provides for MySQL.
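The export, clean, and re-import idea from the reply above could be sketched roughly as follows in Python. This is a sketch under assumptions, not a tested recipe: it assumes the table was exported as CSV with a header row, and that the post bodies live in a column called `post_content` (the name used by the `wp_posts` table in a default WordPress install; check your own database). The function name `clean_export` is hypothetical. Back up the table before importing anything.

```python
import csv
import re

def clean_export(src, dst, column="post_content"):
    """Copy a CSV export from src to dst, unwrapping hard line breaks
    in one column. The column name is an assumption; adjust as needed."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        # Normalize CRLF to LF inside the post body.
        body = row[column].replace("\r\n", "\n")
        # Join single line breaks; keep blank-line paragraph breaks.
        row[column] = re.sub(r"(?<!\n)\n(?=[^\n])", " ", body)
        writer.writerow(row)
```

To use it on real files, open both with `newline=""` as the `csv` module documentation recommends, for example `open("posts.csv", newline="")` for reading and `open("posts_clean.csv", "w", newline="")` for writing (both file names are placeholders).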

  2. Michael Hutchinson Sr says:

    Or are you done already?
