Wednesday, February 27, 2008

A Way Around Web Query Limitations

The other day I got another request for a project that involved pulling data from a website into a spreadsheet. It was a situation where data needed to be posted to a form on the web page to get to the data that was desired. In this case, the user needed to post a start date and end date to the form to return a table of data the he wanted to pull into the spreadsheet. It was a situation that on many web pages would have been super easy using Excel's built-in web query tool and the little known ".posttext" property.

However, due to the way in which this page was built this method wasn't working. A web developer friend speculated it was due to the fact that the page was built using ASP.NET, which complicates a simple form with all kinds of behind the scenes javascript. I couldn't even get the form to post using the POST method of the MSXML.ServerXMLHTTP object (see this old post).

In the end, what I had to do was actually invoke an Internet Explorer session using the "InternetExplorer.Application" object. Once the session was created and I programatically navigated to the desired URL, I was able to identify inputs in the form by ID and put the data needed into each one, then programatically "click" the submit button on the form. Once the new page with the data table loaded, I could select the table of data again by ID, and go through each row and column in the table and pull the data into the spreadsheet. This was actually slicker than if I used my old MSXML.ServerXMLHTTP object method and used Regular Expressions to parse out the data.

This was a fun little learning experience, and I hope you Excel_Geek Insiders enjoy this code. It can be very helpful.

Later,

Excel_Geek

2 comments:

Anonymous said...

I can't find the sample code you are talking about. Where is the link?

Excel_Geek said...

I didn't post any sample code for this. The best way to get access to my code is to become an Insider.