#178123 - 2007-07-20 10:25 PM Convert TXT file from UTF-8 to ANSI
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
Hi folks,

I read some old topics but couldn't find a solution to my problem. I'm downloading a CSV file from a website (normally the file won't be larger than 500 KB) that comes in UTF-8 encoding. What I want to do via Kix is, I think, pretty simple:

$Filename = 'C:\report.csv'

Open the file in Notepad, change the encoding to ANSI, and save the file, overwriting the old one.

Thanks !

Top
#178124 - 2007-07-20 11:08 PM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
Allen Administrator Online
KiX Supporter
*****

Registered: 2003-04-19
Posts: 4545
Loc: USA
I've used the following to convert Unicode text files to ANSI... not sure about the UTF-8 format though

 Code:
shell '%comspec% /c type "' + $filename + '" > "' + $newfilename + '"'

Top
#178125 - 2007-07-20 11:22 PM Re: Convert TXT file from UTF-8 to ANSI [Re: Allen]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
Hi Allen,

Sorry, but I tried it - it doesn't work for UTF-8 :(. There are some Cyrillic symbols inside.

Basically I will be happy if I have something like:

Open the file with Notepad, change the option to ANSI, Save. That's it.
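
For context, here is a minimal sketch of why a plain byte-for-byte copy cannot do the job. It is written in Python purely for illustration (it is not part of any KiX solution), the sample string is made up, and it assumes "ANSI" means the Cyrillic code page 1251 that comes up later in the thread.

```python
# Illustration only: the same Cyrillic text is stored as different bytes
# in UTF-8 and in the single-byte "ANSI" code page 1251 (assumed here).
text = "Привет"  # hypothetical sample content

utf8_bytes = text.encode("utf-8")    # two bytes per Cyrillic letter
ansi_bytes = text.encode("cp1251")   # one byte per Cyrillic letter

print(len(utf8_bytes), len(ansi_bytes))  # prints: 12 6

# Reading the UTF-8 bytes as if they were cp1251 gives mojibake, which is
# what a copy without real re-encoding effectively leaves you with.
misread = utf8_bytes.decode("cp1251", errors="replace")
print(misread == text)  # prints: False
```

So the file has to be decoded as UTF-8 and then written out again in the target code page; copying the bytes is not enough.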

Top
#178126 - 2007-07-20 11:28 PM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
Allen Administrator Online
KiX Supporter
*****

Registered: 2003-04-19
Posts: 4545
Loc: USA
Have a look at sendkeys in the manual... that's the only thing I know of for automating notepad.
Top
#178128 - 2007-07-20 11:49 PM Re: Convert TXT file from UTF-8 to ANSI [Re: Allen]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
Yes, I know sendkeys, but since this script has to work on a locked PC, I don't think it will be of much use... and if I lose focus for a second...

Thanks anyway, seems this is a bigger problem than I thought

Top
#178135 - 2007-07-21 05:26 AM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
NTDOC Administrator Offline
Administrator
*****

Registered: 2000-07-28
Posts: 11623
Loc: CA
I'd start searching Google to find help in this area as I don't think KiX can natively do it.

http://www.codeguru.com/forum/showthread.php?t=288665

Top
#178138 - 2007-07-21 10:04 AM Re: Convert TXT file from UTF-8 to ANSI [Re: NTDOC]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
Hi NTDOC,

Actually I was looking across the web for a solution that would involve nothing more than Notepad and Kix.

Doing this manually is so easy, I can't believe it's so hard to automate. To convert the file manually, I do the following:

1. Open the file in "Notepad"
2. Select "File" --> "Save As"
3. Select Encoding "ANSI"
4. Choose Filename and path....

That's it.

Probably it will be easier to do this via CreateObject ("Word.Application") and then Save As with proper Encoding. I will check if this can happen and post here if something works out.

Top
#178139 - 2007-07-21 11:49 AM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
Ok folks, here's what I came up with - pretty much suits me:

It is a combination of VB and Kix (VB and MS Word do most of the work).

The Kix code only starts MS Word with the /m switch, which runs the macro:

======================================
Kix file: Transform.kix
======================================
 Code:
$WordApp = '"C:\Program Files\Microsoft Office\Office11\Winword.exe" /mMacro2'
SHELL $WordApp


Macro2() in Visual Basic in your Normal.dot in MS Word would be:

 Code:
Sub Macro2()
'
' Macro2 Macro
' Macro recorded 21.7.2007  by BOBBYD
'
    ChangeFileOpenDirectory "C:\"
    Documents.Open FileName:="report.csv", ConfirmConversions:=False, ReadOnly _
        :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
        :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
        , Format:=wdOpenFormatAuto, XMLTransform:="", Encoding:=65001
    ChangeFileOpenDirectory "C:\"
    ActiveDocument.SaveAs FileName:="reportANSI.csv", FileFormat:=wdFormatText _
        , LockComments:=False, Password:="", AddToRecentFiles:=True, _
        WritePassword:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
         SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
        False, Encoding:=1251, InsertLineBreaks:=False, AllowSubstitutions:=False _
        , LineEnding:=wdCRLF
    Application.Quit
End Sub


It's pretty simple and can convert any TXT file from UTF-8 to ANSI (and to other encodings, of course)... still not as clean as I wanted, but if nothing better comes up I will use it.
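
Stripped of the Word plumbing, the macro above decodes the file as code page 65001 (UTF-8) and writes it back as code page 1251. A minimal sketch of that same round trip, in Python for illustration only: the function name, the sample data and the temporary file names are hypothetical, and accepting a leading byte order mark via "utf-8-sig" is an assumption, not something the macro is known to do.

```python
import os
import tempfile


def convert_utf8_to_ansi(src_path, dst_path, ansi_codec="cp1251"):
    """Re-encode a UTF-8 text file into a single-byte ANSI code page."""
    # "utf-8-sig" also accepts files that start with a byte order mark.
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    # Characters missing from the code page raise UnicodeEncodeError
    # instead of being silently replaced.
    with open(dst_path, "w", encoding=ansi_codec, newline="") as dst:
        dst.write(text)


# Small self-check with temporary files and made-up sample data.
work_dir = tempfile.mkdtemp()
src_file = os.path.join(work_dir, "report.csv")
dst_file = os.path.join(work_dir, "reportANSI.csv")
sample = "имя,город\r\nИван,София\r\n"
with open(src_file, "w", encoding="utf-8", newline="") as f:
    f.write(sample)

convert_utf8_to_ansi(src_file, dst_file)

with open(dst_file, "rb") as f:
    converted = f.read()
print(converted.decode("cp1251") == sample)  # prints: True
```

Unlike the macro, the file names are parameters here, so nothing is tied to a fixed path.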


Edited by green78 (2007-07-21 12:00 PM)

Top
#178140 - 2007-07-21 07:26 PM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
NTDOC Administrator Offline
Administrator
*****

Registered: 2000-07-28
Posts: 11623
Loc: CA
Well I'm leaving soon but it looks like the whole thing might be able to be done with KiX.

Maybe someone like Shawn or Richard will stop by and be able to give you some code.

Top
#178144 - 2007-07-21 10:58 PM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
Witto Offline
MM club member
*****

Registered: 2004-09-29
Posts: 1828
Loc: Belgium
In your solution, you have to rely on the existence of a macro and the filenames always have to be the same. Maybe with some COM scripting, you could create a more flexible and reusable script.
Top
#178145 - 2007-07-21 11:05 PM Re: Convert TXT file from UTF-8 to ANSI [Re: Witto]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
 Originally Posted By: Witto
In your solution, you have to rely on the existence of a macro and the filenames always have to be the same. Maybe with some COM scripting, you could create a more flexible and reusable script.


I would, Witto, the only problem is that I can't :).

I'm a newbie in Kix, and my knowledge stops here. I only found what suits me at the moment - fast conversion of a text file from UTF-8 to ANSI (cp:1251) in my case. I needed a fast and fully automated solution.

I know it's rubbish as a composite, but that's why I posted it - in case someone more experienced can turn it into a neater solution.

My guess is CreateObject ("Word.Application")
Open document... SaveAs....

And I stop here, I don't know how to set the Encoding to what I desire.

Maybe creating a UDF would be perfect, but I don't know if it would be useful for many people, and yes - you must have MS Word installed - another dependency.

Top
#178146 - 2007-07-22 01:49 AM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
Witto Offline
MM club member
*****

Registered: 2004-09-29
Posts: 1828
Loc: Belgium
Here is some explanation about the constants:
Word 2003 VBA Language Reference: Word Enumerated Constants
You can reuse the macro you created. With some imagination, and AdminScriptEditor to reveal all the arguments, you could write something like:

 Code:
;*************************************************************************
; Script Name:
; Author: green78
; Date: 21/07/2007
; Description:
;************************************************************************* 


;Script Options
If Not @LOGONMODE
    Break On
Else
    Break Off
EndIf
Dim $RC
$RC = SetOption("Explicit", "On")
$RC = SetOption("NoMacrosInStrings", "On")
$RC = SetOption("NoVarsInStrings", "On")
If @SCRIPTEXE = "KIX32.EXE"
    $RC = SetOption("WrapAtEOL", "On")
EndIf

;Declare variables
Dim $strSrcFile, $strDstFile

Dim $wdOpenFormatAllWord, $wdOpenFormatAuto, $wdOpenFormatDocument, $wdOpenFormatEncodedText
Dim $wdOpenFormatRTF, $wdOpenFormatTemplate, $wdOpenFormatText, $wdOpenFormatUnicodeText
Dim $wdOpenFormatWebPages

Dim $wdFormatDocument, $wdFormatDOSText, $wdFormatDOSTextLineBreaks, $wdFormatEncodedText
Dim $wdFormatFilteredHTML, $wdFormatHTML, $wdFormatRTF, $wdFormatTemplate, $wdFormatText
Dim $wdFormatTextLineBreaks, $wdFormatUnicodeText, $wdFormatWebArchive, $wdFormatXML

Dim $wdCRLF, $wdCROnly, $wdLFCR, $wdLFOnly, $wdLSPS

Dim $objWord

;Initialize variables
$strSrcFile = "C:\test\test-UDF-8.txt"
$strDstFile = "C:\test\test-ANSI.txt"

$wdOpenFormatAllWord = 6
$wdOpenFormatAuto = 0
$wdOpenFormatDocument = 1
$wdOpenFormatEncodedText = 5
$wdOpenFormatRTF = 3
$wdOpenFormatTemplate = 2
$wdOpenFormatText = 4
$wdOpenFormatUnicodeText = 5
$wdOpenFormatWebPages = 7

$wdFormatDocument = 0
$wdFormatDOSText = 4
$wdFormatDOSTextLineBreaks = 5
$wdFormatEncodedText = 7
$wdFormatFilteredHTML = 10
$wdFormatHTML = 8
$wdFormatRTF = 6
$wdFormatTemplate = 1
$wdFormatText = 2
$wdFormatTextLineBreaks = 3
$wdFormatUnicodeText = 7
$wdFormatWebArchive = 9
$wdFormatXML = 11

$wdCRLF = 0
$wdCROnly = 1
$wdLFCR = 3
$wdLFOnly = 2
$wdLSPS = 4


$objWord = CreateObject("Word.Application")
If @ERROR
    ? "Error creating Word object"
    ? "Error " + @ERROR + ": " + @SERROR
    Exit @ERROR
EndIf

;$objWord.Visible = True ;No need to make this visible
;$RC = $objWord.Documents.Open($strSrcFile,0,0,0,"","",0,"","",$wdOpenFormatAuto,65001,,,,,"")

$RC = $objWord.Documents.Open($strSrcFile) ;You will see this is enough
;$RC = $objWord.ActiveDocument.SaveAs($strDstFile,$wdFormatText,0,"",1,"",0,0,0,0,0,1252,0,0,$wdCRLF)
$RC = $objWord.ActiveDocument.SaveAs($strDstFile,$wdFormatText) ;You will see this is enough
$objWord.Application.Quit

Here is the essential and simplified part:

 Code:
Break ON
$strSrcFile = "C:\test\test-UDF-8.txt"
$strDstFile = "C:\test\test-ANSI.txt"
$objWord = CreateObject("Word.Application")
$RC = $objWord.Documents.Open($strSrcFile)
$RC = $objWord.ActiveDocument.SaveAs($strDstFile,2)
$objWord.Application.Quit

Top
#178150 - 2007-07-22 11:53 AM Re: Convert TXT file from UTF-8 to ANSI [Re: Witto]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
Witto,

This is great. Many thanks for your effort. This is exactly what it has to be - simple, effective :).

Using Word is inevitable in this case, since I don't see enumerations for Notepad :).

This will do perfect work, many thanks once again.

Top
#201265 - 2010-12-22 08:43 PM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
Hi guys,

Would it be possible to extend the script below with a Replace All step? The block from "With Selection.Find" to "Selection.Find.Execute" is VBA recorded in Word - I hope it can be translated so that the whole thing can be done via Kix only?

Thank you!

Break ON
$strSrcFile = "C:\test\test-UDF-8.txt"
$strDstFile = "C:\test\test-ANSI.txt"
$objWord = CreateObject("Word.Application")
$RC = $objWord.Documents.Open($strSrcFile)

With Selection.Find
.Text = "~"
.Replacement.Text = ","
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

$RC = $objWord.ActiveDocument.SaveAs($strDstFile,2)
$objWord.Application.Quit


Edited by green78 (2010-12-22 09:04 PM)

Top
#201268 - 2010-12-22 11:56 PM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
ShaneEP Moderator Offline
MM club member
*****

Registered: 2002-11-29
Posts: 2125
Loc: Tulsa, OK
Well, I gave it my best shot, but I'm not able to find the correct syntax for the .execute statement. If I just do a simple .Execute() it seems to be finding the text successfully, but can't get the replace parameter right. Hopefully someone else with more COM experience will come along soon and help you out.

 Code:
Break ON
$strSrcFile = "C:\test\test-UDF-8.txt"
$strDstFile = "C:\test\test-ANSI.txt"
$strFind = "~"
$strReplace = ","
$objWord = CreateObject("Word.Application")
$RC = $objWord.Documents.Open($strSrcFile)

$objWord.Selection.Find.Text = $strFind
$objWord.Selection.Find.Replacement.Text = $strReplace
$objWord.Selection.Find.Forward = 1
$objWord.Selection.Find.Format = 0
$objWord.Selection.Find.MatchCase = 0
$objWord.Selection.Find.MatchWholeWord = 0
$objWord.Selection.Find.MatchWildcards = 0
$objWord.Selection.Find.MatchSoundsLike = 0
$objWord.Selection.Find.MatchAllWordForms = 0
$RC = $objWord.Selection.Find.Execute(Replace:=wdReplaceAll)

$RC = $objWord.ActiveDocument.SaveAs($strDstFile,2)
$RC = $objWord.Application.Quit


Edited by CitrixMan (2010-12-22 11:57 PM)

Top
#201269 - 2010-12-23 12:47 AM Re: Convert TXT file from UTF-8 to ANSI [Re: ShaneEP]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
Thanks a lot CitrixMan... this gives some guidelines.

I'm trying to figure it out from here - execute method as it relates to find:

http://msdn.microsoft.com/en-us/library/aa171990(v=office.11).aspx

but can't get it right :(

Edit: Woo-hoo I made it :), finally... here it is:

 Code:
 
Break ON
$strSrcFile = "C:\kix\test1.txt"
$strDstFile = "C:\kix\test2.txt"
$strFind = "~"
$strReplace = ","
$objWord = CreateObject("Word.Application")
$RC = $objWord.Documents.Open($strSrcFile)
$RC = $objWord.ActiveDocument.Content.Find.Execute($strFind, , , , , , , , , $strReplace, 2)
$RC = $objWord.ActiveDocument.SaveAs($strDstFile,2)
$RC = $objWord.Application.Quit


Edited by green78 (2010-12-23 12:53 AM)
Edit Reason: I made it :)

Top
#201270 - 2010-12-23 01:03 AM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
Basically, the idea of the whole exercise was to open a "CSV" file that uses ~ (tilde) as the separator and is UTF-8 encoded, replace all the commas in it with periods, then change the tildes to commas and save it as an ANSI-encoded pure CSV file without stray commas.

Here is the full code for it, in case someone is interested in such a scenario or just in replacing a string in a text file:

 Code:
 
Break ON
$strSrcFile = "C:\kix\test1.txt"
$strDstFile = "C:\kix\test2.txt"
$strFind = ","
$strReplace = "."

$strFind2 = "~"
$strReplace2 = ","
$objWord = CreateObject("Word.Application")
$RC = $objWord.Documents.Open($strSrcFile)
$RC = $objWord.ActiveDocument.Content.Find.Execute($strFind, , , , , , , , , $strReplace, 2)
$RC = $objWord.ActiveDocument.Content.Find.Execute($strFind2, , , , , , , , , $strReplace2, 2)
$RC = $objWord.ActiveDocument.SaveAs($strDstFile,2)
$RC = $objWord.Application.Quit
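
The order of the two Find.Execute calls above matters: the existing commas have to be moved out of the way before the tildes become commas. A minimal Python sketch of the same two-step replacement, for illustration only (the function name and the sample line are hypothetical):

```python
def tilde_csv_to_comma_csv(line):
    """Turn a tilde-separated line into a comma-separated one."""
    # Step 1: replace the commas that are part of the data with periods.
    # Step 2: only then turn the tilde separators into commas.
    return line.replace(",", ".").replace("~", ",")


sample = "Sofia, Bulgaria~1,5~ok"
print(tilde_csv_to_comma_csv(sample))  # prints: Sofia. Bulgaria,1.5,ok

# Doing the steps the other way round destroys the separators as well.
wrong_order = sample.replace("~", ",").replace(",", ".")
print(wrong_order)  # prints: Sofia. Bulgaria.1.5.ok
```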



Edited by green78 (2010-12-23 01:06 AM)

Top
#212537 - 2017-06-03 09:32 PM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
Bringing this topic up again as I have a new challenge. So far, with your help, I managed to automate the conversion of a txt file from UTF-8 to ANSI through MS Word. The problem is that MS Office applications shouldn't be used for server-side automation, as error messages may pop up that need user interaction.

Long story short, I need to convert a txt file from UTF-8 encoding to ANSI without using MS Office applications. What I managed to find online is a VB script that uses ADODB.Stream, but when I convert it to a KIX script it doesn't work properly for me.

Can someone take a look and advise what I am doing wrong? I'm getting an error of type: "Expected ")" at line
 Code:
$stream = CreateObject("ADODB.Stream")
$stream.Open
$stream.Type = 2 'text
$stream.Charset = "utf-8"
$stream.LoadFromFile "C:\input.txt"
$text = stream.ReadText
$stream.Close

$fso = CreateObject("Scripting.FileSystemObject")
$f = fso.OpenTextFile("C:\output.txt", 2, True, True)
$f.Write $text
$f.Close

Top
#212538 - 2017-06-04 04:59 PM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
green78 Offline
Fresh Scripter

Registered: 2007-05-02
Posts: 34
After some more searching on the web, it looks like PowerShell can be an easy solution (for the output file I use the machine's default encoding instead of "ascii", as I need to preserve the Cyrillic characters). I guess I will go with the below:

 Code:
gc -en utf8 utf8.txt | Out-File -en default out.txt


or this

 Code:
[io.file]::ReadAllText("c:\kix\utf8.txt", [System.Text.Encoding]::utf8) | %{[io.file]::WriteAllText("c:\kix\out.txt", $_, [System.Text.Encoding]::Default)}
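
The reason for choosing the default encoding over "ascii" can be sketched in a few lines of Python (for illustration only; this is not PowerShell). The sample text is made up, and the sketch assumes the machine's default ANSI code page is 1251.

```python
text = "Иван,София"  # hypothetical sample row

# ASCII has no Cyrillic letters, so each one degrades to "?".
ascii_bytes = text.encode("ascii", errors="replace")
print(ascii_bytes)  # prints: b'????,?????'

# Code page 1251 (assumed to be the machine default) keeps them intact.
ansi_bytes = text.encode("cp1251")
print(ansi_bytes.decode("cp1251") == text)  # prints: True
```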

Top
#212539 - 2017-06-06 07:30 PM Re: Convert TXT file from UTF-8 to ANSI [Re: green78]
ShaneEP Moderator Offline
MM club member
*****

Registered: 2002-11-29
Posts: 2125
Loc: Tulsa, OK
 Code:
$stream = CreateObject("ADODB.Stream")
$stream.Open()
$stream.Type = 2
$stream.Charset = "utf-8"
$stream.LoadFromFile("C:\input.txt")
$text = $stream.ReadText()
$stream.Close()

$fso = CreateObject("Scripting.FileSystemObject")
$f = $fso.OpenTextFile("C:\output.txt", 2, 1, 0) ; -2 for Default, -1 for Unicode, 0 for ASCII
$f.Write($text)
$f.Close()
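
One thing worth checking before writing in a single-byte format is whether every character in the text actually exists in the target code page, since anything outside it cannot be stored. A minimal Python sketch of such a check, for illustration only: the function name is hypothetical, and code page 1251 is an assumption carried over from earlier in the thread.

```python
def unencodable_chars(text, codec="cp1251"):
    """Return the distinct characters in text that codec cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            # Remember each problem character once, in order of appearance.
            if ch not in bad:
                bad.append(ch)
    return bad


print(unencodable_chars("Иван,София"))  # prints: []
print(unencodable_chars("Иван 日本"))    # prints: ['日', '本']
```

An empty list means the text should survive the conversion unchanged.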

Top