Page 3 of 3
#162364 - 2006-06-03 12:49 AM Re: Web scraping in KiX
pearly Offline
Getting the hang of it
*****

Registered: 2004-02-04
Posts: 92
I get the same result as before: ERROR 404.

Well, the login page was the main purpose of this project, but I thought wget could be leveraged for other uses too. I guess I'll have to find some other way to handle secured pages.

#162365 - 2006-06-03 12:13 PM Re: Web scraping in KiX
Lonkero Administrator Offline
KiX Master Guru
*****

Registered: 2001-06-05
Posts: 22298
Loc: OK
Hmm...
I'm not sure you know yet what you really want, or maybe I still don't know all the details.

How does it help to get the HTML of a login page?
Once you have fetched it, don't you already know what it contains?
_________________________
!

download KiXnet

#162366 - 2006-06-03 03:31 PM Re: Web scraping in KiX
Les Offline
KiX Master
*****

Registered: 2001-06-11
Posts: 12734
Loc: fortfrances.on.ca
The way Pearly keeps dancing around the truth, I think there is some malfeasance here. It looks to me like he is trying to steal passwords off the logon page.
_________________________
Give a man a fish and he will be back for more. Slap him with a fish and he will go away forever.

#162367 - 2006-06-03 07:32 PM Re: Web scraping in KiX
Lonkero Administrator Offline
Could be, could be.
I'm still giving him the benefit of the doubt. I think he is approaching the problem too much piece by piece and not seeing the bigger picture or the steps needed.
#162368 - 2006-06-05 07:08 PM Re: Web scraping in KiX
pearly Offline
Quote:

The way Pearly keeps dancing around the truth, I think there is some malfeasance here. It looks to me that he is trying to steal passwords off the logon page.

Hehe, I'm afraid that's not the case. As I said before, this is for my job as a QA engineer. We parse HTML all the time, but the methods we use aren't the most efficient or effective. The reason for pulling the source from the login page is to retrieve the DLL versions posted on it: the developers display the versions there, and I need them for my test validation. With the code from my first post in this thread, I was able to parse HTML tables once past the login page, but it only works in VBA/VBS (see my other thread: http://www.kixtart.org/ubbthreads/showflat.php?Cat=0&Number=162816&an=0&page=0#162816).

So all in all, I figured out how to pull data from HTML tables and can access the DLL versions. I can live with this for now. What I'm not sure about is whether wget or KiX COM can access secured pages, or hook onto an existing browser that is already past the login page.

I hope that answers what I'm trying to do.

#162369 - 2006-06-05 07:27 PM Re: Web scraping in KiX
Lonkero Administrator Offline
Yes, and yes.
You can use something like the InternetExplorer.Application COM object to get 100% IE compatibility.
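
A minimal sketch of that suggestion in KiX, assuming Internet Explorer is installed and that any login is completed by hand in the window the script opens. The address in $url is a placeholder, not one taken from this thread, and the table count is only there to show that the document is reachable.

```kixtart
; Sketch: start Internet Explorer through COM and read data from the loaded page.
; $url is a hypothetical address used for illustration only.
$url = "http://intranet.example/login"

$ie = CreateObject("InternetExplorer.Application")
If @ERROR
    ? "Could not create InternetExplorer.Application: " + @SERROR
    Exit 1
EndIf

$ie.Visible = 1
$ie.Navigate($url)

; Wait until the page has finished loading (ReadyState 4 means complete).
While $ie.Busy Or $ie.ReadyState <> 4
    Sleep 1
Loop

; From here on the script works with the live document, not with raw HTML text.
$doc = $ie.Document
? "Title: " + $doc.title
? "Tables on page: " + $doc.All.Tags("TABLE").length
```

Because the browser itself makes the requests, cookies and session state are handled the same way as in an interactive session, which is presumably what "100% IE compatibility" refers to.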
#162370 - 2006-06-05 07:32 PM Re: Web scraping in KiX
pearly Offline
I know how to hook onto an existing browser, but then what do you do to access the source?
#162371 - 2006-06-05 08:55 PM Re: Web scraping in KiX
Lonkero Administrator Offline
You can't really access the source.
You need to access the elements and their data.
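
A rough sketch of what accessing the elements could look like when hooking onto a browser that is already open. It assumes at least one Internet Explorer window exists, and filtering on "iexplore" in the window's FullName is an illustrative way to skip Explorer folder windows, which Shell.Application lists as well.

```kixtart
; Sketch: attach to already open IE windows and read element data.
$shell   = CreateObject("Shell.Application")
$windows = $shell.Windows()

For Each $win In $windows
    ; Keep only browser windows; folder windows have a different host program.
    If InStr($win.FullName, "iexplore")
        $doc = $win.Document
        ? "URL:   " + $win.LocationURL
        ? "Title: " + $doc.title
        ; innerText is the rendered text of an element, innerHTML is its markup.
        ? $doc.body.innerText
    EndIf
Next
```

The markup returned by innerHTML or outerHTML is the browser's serialization of the current DOM, so it can differ from the HTML the server originally sent.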
#162372 - 2006-06-05 09:58 PM Re: Web scraping in KiX
pearly Offline
Hmm, OK. I've converted the code in this thread (http://www.kixtart.org/ubbthreads/showflat.php?Cat=0&Number=162816&an=0&page=0#162816) to KiX. Any suggestions on how to easily identify parent/child relationships between objects, collections, and such?
#162373 - 2006-06-05 10:08 PM Re: Web scraping in KiX
Lonkero Administrator Offline
You get the children via the .items collection.
I don't remember whether you can access the parent as simply as with .parent.
#162374 - 2006-06-05 10:35 PM Re: Web scraping in KiX
pearly Offline
Take this example, for instance:

$objWinShell = CreateObject("Shell.Application")

$obj = $objWinShell.Windows.Item(1).Document.All.Tags("TABLE").Item(6).Rows(0).Cells(0).InnerText

How do I know the relationship between all these objects/collections/properties? Is there a property that lists all objects/collections/properties for that object?
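
One way to make the relationships visible is to split the chain into named steps. The sketch below does that with the same calls as the one-liner above. The window index 1 and table index 6 come from that line and depend entirely on which windows are open and how the page is laid out. The variable names are illustrative. As far as I know there is no single property that lists an object's members from script, so the object model documentation or an object browser is the usual route. In the IE DOM, parentElement and children are the properties for moving up and down the element tree.

```kixtart
; Sketch: the one-line chain split into steps, one object or collection per line.
$shell   = CreateObject("Shell.Application")
$windows = $shell.Windows()            ; collection of open shell and browser windows
$browser = $windows.Item(1)            ; one window from that collection
$doc     = $browser.Document           ; the HTML document loaded in that window
$tables  = $doc.All.Tags("TABLE")      ; collection of every TABLE element
$table   = $tables.Item(6)             ; a single TABLE element
$row     = $table.Rows(0)              ; first row of that table
$cell    = $row.Cells(0)               ; first cell of that row

? "Windows open: " + $windows.Count
? "Tables found: " + $tables.length
? "Cell text:    " + $cell.InnerText

; Walk up: each element exposes its container as parentElement.
? "Cell parent:  " + $cell.parentElement.tagName
? "Row parent:   " + $row.parentElement.tagName

; Walk down: children holds the direct child elements.
For Each $child In $row.children
    ? "Row child:    " + $child.tagName
Next
```

Holding each level in its own variable also makes it easier to check @ERROR after a step and see where a chain breaks when a page changes.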

#162375 - 2006-06-06 12:39 AM Re: Web scraping in KiX
pearly Offline
DOM questions are probably out of scope for this forum, so I will look for the answers on my own. I appreciate all your help, Jooel and NTDOC.