A short while ago I was intrigued by a reply made by nullptr in this thread.

So I began looking for ways to access the rest of the IE (IWebBrowser2) interface, with limited and varied success.

Accessing, for instance, a <div></div> element in a web page's HTML document was quite difficult, as there is very little documentation and few examples around the web, but I got there in the end.

But I have hit a problem: I cannot seem to access an element in a frame on a web page.
I have marked in the code where the debugger reports the error.

IHTMLElement * getFrameElementByID(BSTR target){

    IDispatch * pDispatch;
    IHTMLDocument2 * pHTMLDocument;
    IHTMLElement * pHTMLElement;
    HRESULT hRes = S_OK;

    hRes = pBrowser->get_Document(&pDispatch);
    if (!SUCCEEDED(hRes)){
        _Error(1);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;

    hRes = pDispatch->QueryInterface(IID_IHTMLDocument2, (void**)&pHTMLDocument);
    if (!SUCCEEDED(hRes) || !pHTMLDocument){
        pDispatch->Release();
        _Error(2);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;

    IHTMLWindow2 * pParentWnd = NULL;
    hRes = pHTMLDocument->get_parentWindow(&pParentWnd);
    if (!SUCCEEDED(hRes) || !pParentWnd){
        pDispatch->Release();
        _Error(4);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;

    long ilFramesCount = 1;
    long frameIndex = 0;
    IHTMLFramesCollection2 * pFrames = NULL;

    hRes = pHTMLDocument->get_frames(&pFrames);
    if (!SUCCEEDED(hRes) || !pFrames){
        pDispatch->Release();
        _Error(3);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;

    long len = 0;
    pFrames->get_length(&len);

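    VARIANT vFrame;
    VARIANT ret;
    VariantInit(&vFrame);
    VariantInit(&ret);
    vFrame.vt = VT_I4;          // lVal is only valid together with VT_I4
    vFrame.lVal = frameIndex;

    hRes = pFrames->item(&vFrame, &ret);
    // The frame window comes back as an IDispatch inside ret, so check ret rather than pFrames.
    if (!SUCCEEDED(hRes) || ret.vt != VT_DISPATCH || !ret.pdispVal){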
        pDispatch->Release();
        _Error(4);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;

    IHTMLWindow2 * pFramewin = NULL;

    hRes = ret.pdispVal->QueryInterface(IID_IHTMLWindow2, (LPVOID *)&pFramewin);
    if (!SUCCEEDED(hRes) || !pFramewin){
        pDispatch->Release();
        _Error(5);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;

    IHTMLDocument2 * pFrameDoc = NULL;

    hRes = pFramewin->get_document(&pFrameDoc);
    if (!SUCCEEDED(hRes) || !pFrameDoc){
        pDispatch->Release();
        _Error(6);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;

    IHTMLElementCollection * pHTMLElementCollection = NULL;
    hRes = pFrameDoc->get_all(&pHTMLElementCollection);
    if (!SUCCEEDED(hRes) || !pHTMLElementCollection){
        pDispatch->Release();
        _Error(7);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;

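    // Construct the variants through _variant_t so the type tag (vt) is set;
    // assigning bstrVal/lVal directly leaves vt as VT_EMPTY.
    _variant_t varName(target);
    _variant_t varIndex(0L, VT_I4);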

    IDispatch * pDispatch2 = NULL;
    hRes = pHTMLElementCollection->item(varName, varIndex, &pDispatch2);
    if (!SUCCEEDED(hRes) || !pDispatch2){
        pDispatch->Release();
        _Error(8);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;
    hRes = pDispatch2->QueryInterface(IID_IHTMLElement, (void**)&pHTMLElement);
    if (!SUCCEEDED(hRes) || !pHTMLElement){
        pDispatch->Release();
        _Error(9);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;

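    VARIANT myout;
    VariantInit(&myout);            // give the out-parameter a defined type tag
    BSTR attrName = SysAllocString(L"id");

    // Flag 2 asks for the value as a BSTR. Originally myout was reported as a bad pointer here.
    hRes = pHTMLElement->getAttribute(attrName, 2L, &myout);
    SysFreeString(attrName);
    // Check the type tag rather than boolVal, which is only meaningful for VT_BOOL.
    if (!SUCCEEDED(hRes) || myout.vt != VT_BSTR || !myout.bstrVal){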
        pDispatch->Release();
        _Error(10);
        _hResError(hRes);
        return nullptr;
    }
    hRes = NULL;

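    _MessageBoxW(L"finish", std::wstring(myout.bstrVal));
    VariantClear(&myout);   // frees the BSTR owned by myout

    // Return the element so the function does what its name says; the caller must Release() it.
    return pHTMLElement;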
}

Well, actually it does not crash: _Error(10) and _hResError(hRes) are executed, and it is only when I add a breakpoint that I see the bad pointer.
But the value of hRes is S_OK.
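
An S_OK result only says that the call itself succeeded; which member of a VARIANT is valid is decided by its type tag (vt), so a member such as boolVal or bstrVal should only be read after checking that tag. Here is a minimal, portable sketch of that idea. MiniVariant, Tag, makeText, hasText and textOrEmpty are hypothetical stand-ins invented for illustration, not the Windows types; in the real code the equivalent test would be along the lines of myout.vt == VT_BSTR.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified stand-in for VARIANT: a type tag plus a union of payloads.
enum class Tag : std::uint16_t { Empty, Null, Bool, Text };

struct MiniVariant {
    Tag tag = Tag::Empty;
    union {
        std::int16_t   boolVal;   // shares storage with textVal
        const wchar_t* textVal;
    };
    MiniVariant() : textVal(nullptr) {}
};

// Build a variant that holds text (the caller keeps ownership of the string).
inline MiniVariant makeText(const wchar_t* s) {
    MiniVariant v;
    v.tag = Tag::Text;
    v.textVal = s;
    return v;
}

// Reliable check: look at the tag first, then at the matching member only.
inline bool hasText(const MiniVariant& v) {
    return v.tag == Tag::Text && v.textVal != nullptr;
}

// Returns the text, or an empty string when the variant holds anything else.
inline std::wstring textOrEmpty(const MiniVariant& v) {
    return hasText(v) ? std::wstring(v.textVal) : std::wstring();
}
```

With this shape, a "call succeeded but there is no string" result (for example a Null tag) is handled as an ordinary case instead of being read through the wrong union member.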

I'm trying here to just get the id attribute of a div element within the frame, which is not a cross-domain frame.
I'm hoping someone here has experience with COM, and specifically IE, and is willing and able to help.

Thank you for taking the time to read.

(edit) I'm hoping to achieve this in native Win32 C/C++, without the aid of .NET or any other wrappers.

Edited 3 Years Ago by Suzie999

Before I attempt to read about COM communication with IE, is there a reason you can't just fetch the web page instead? Must it involve an active user session? (Assuming JavaScript alters the div or something.)

It seems simpler to open a socket and perform a GET request yourself.

Additionally, I see a lot of redundant patterns in the above code.
Is it unreasonable to create a basic wrapper class for those patterns?

Edit: Example

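// IUnknown is the base of every COM interface, so it already declares Release().
bool success( IUnknown* COM_ptr, HRESULT hRes, int errorCode )
{
  if( !SUCCEEDED(hRes) || !COM_ptr )
  {
    if( COM_ptr ) COM_ptr->Release(); // guard: COM_ptr may be null on this path
    _Error(errorCode);
    _hResError(hRes);
    return false;
  }
  return true;
}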

This pattern occurs like 5 or 10 times...
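
Another way to cut the repetition is to let a small RAII wrapper call Release() automatically, so each early return no longer needs manual clean-up. The sketch below is portable and uses a hypothetical FakeObject and getObject in place of a real COM interface and getter; ScopedRef is likewise an illustrative name. On Windows, ready-made equivalents such as ATL's CComPtr or WRL's ComPtr already exist.

```cpp
#include <utility>

// Hypothetical stand-in for a COM object: it only counts references.
struct FakeObject {
    long refs = 0;
    unsigned long AddRef()  { return static_cast<unsigned long>(++refs); }
    unsigned long Release() { return static_cast<unsigned long>(--refs); }
};

// Mimics a COM getter: hands out an AddRef'd pointer through an out-parameter.
inline bool getObject(FakeObject& source, FakeObject** out) {
    source.AddRef();
    *out = &source;
    return true;
}

// Owns one reference and releases it when it goes out of scope.
template <typename T>
class ScopedRef {
public:
    ScopedRef() = default;
    ScopedRef(const ScopedRef&) = delete;
    ScopedRef& operator=(const ScopedRef&) = delete;
    ScopedRef(ScopedRef&& other) noexcept : p_(std::exchange(other.p_, nullptr)) {}
    ~ScopedRef() { reset(); }

    T** put() { reset(); return &p_; }            // pass to out-parameters
    T*  get() const { return p_; }
    T*  operator->() const { return p_; }
    explicit operator bool() const { return p_ != nullptr; }

    T*  detach() { return std::exchange(p_, nullptr); }   // give ownership to the caller
    void reset() { if (p_) { p_->Release(); p_ = nullptr; } }

private:
    T* p_ = nullptr;
};
```

In a function like the one above, each interface pointer would be declared as a ScopedRef and filled through put(), and only the final element would be handed back with detach(), leaving the caller responsible for that one Release().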

Edited 3 Years Ago by Unimportant

I see your point about redundant code. Thanks for the observation, I appreciate it.

I'm not sure about sockets. I have never used them (though I had never used COM until last week either), but I don't think I could act upon an element and manipulate it (set new values, scroll, click buttons for testing, etc.) with sockets. Plus, I want to learn about COM and interfaces.

Thanks for the attention and reply Unimportant.

Actually, while researching to get where I am now, I happened upon a lot of quite expensive software for automated testing of websites, and although that was not my goal, I think it's quite a handy thing to learn and have in one's employment toolbag.

Edited 3 Years Ago by Suzie999
