Thread: Needs Multiple Examples of Web Session Based Access Credentials

Replies: 4 - Last Post: Oct 16, 2011 7:14 AM by: karl.wright
mbennett

Posts: 5
From: California
Registered: 9/25/11
Needs Multiple Examples of Web Session Based Access Credentials
Posted: Oct 15, 2011 6:01 PM

I was excited to get the full version of the online book, but then disappointed when it referred back to the online doc for setting up logins for Web spidering. The online doc is very vague and gives only one example. I've used Ultraseek's and Google's spiders, but I still find the session login sequences non-obvious.

I've got a subscription request in to the user mailing list, but here are the parts that are not clear.

I generally understand about using regexes to define sites and sorting out content pages from login pages.

But it's not clear why there are TWO regexes per entry. There's a "Login URL" regex, and also a "Form name/link target" regex.

The "page type" radio button choices are also not clear.

For "redirection", am I saying "look for a redirect event", or am I saying "then DO a redirect to this page"?

And for "form name", what if my login page doesn't have a named form? In the case of the site I'm trying to spider, when your session expires, you manually go back to an https page and supply your username and password as CGI parameters. I know this sounds odd, but it's apparently how a number of the sites we're trying to spider work; they run some proprietary software.

Karl, I really think the book or Wiki or doc needs 3 or 4 different examples of login scenarios.

Here's the scenario I'm trying, if you'd like to use it:

Try to fetch: http://site.com/product?id=1234
If you get a redirect to: http://site.com/Main.asp
Note that there's no login form nor link on this page.
Then invoke this login URL: https://site.com/validate?username=me&password=that&otherArg=something
Note that you can't just visit this page and fill in a form; that gives an error. The credentials have to be passed in as URL parameters (I think as a GET).
Then record the session cookie and try for /product?id=1234 again.
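
To make the sequence above concrete, here is a rough sketch of the same fetch / redirect / login / retry flow as a standalone Python script. It is only an illustration of the behavior I'm after, not how the ManifoldCF Web connector does or should implement it. The function names (make_cookie_fetcher, fetch_with_login) are made up for this example, and the URLs are the placeholder ones from the scenario.

```python
import http.cookiejar
import urllib.request
from typing import Callable, Tuple

# A fetcher takes a URL and returns (final_url_after_redirects, body_text).
Fetcher = Callable[[str], Tuple[str, str]]


def make_cookie_fetcher() -> Fetcher:
    """Build a fetcher that follows redirects and keeps cookies between calls."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

    def fetch(url: str) -> Tuple[str, str]:
        # urllib follows HTTP redirects by default; geturl() is where we ended up.
        with opener.open(url) as resp:
            return resp.geturl(), resp.read().decode("utf-8", errors="replace")

    return fetch


def fetch_with_login(fetch: Fetcher, target_url: str,
                     expired_url: str, login_url: str) -> Tuple[str, str]:
    """Fetch target_url; if we land on expired_url, log in once and retry."""
    final_url, body = fetch(target_url)
    # Ignore any query string when checking for the "session expired" page.
    if final_url.split("?", 1)[0] != expired_url:
        return final_url, body
    # Hypothetical login step: credentials travel as GET parameters in login_url.
    # The cookie-aware fetcher records whatever session cookie comes back.
    fetch(login_url)
    # Second attempt, now with the session cookie attached.
    return fetch(target_url)


# Intended use against the (placeholder) site from the scenario:
#   fetch = make_cookie_fetcher()
#   fetch_with_login(
#       fetch,
#       "http://site.com/product?id=1234",
#       "http://site.com/Main.asp",
#       "https://site.com/validate?username=me&password=that&otherArg=something",
#   )
```

The fetcher is passed in as a parameter so the redirect-then-login logic can be exercised with a fake fetcher, without touching a real site.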

I realize this is odd; I didn't design it.

karl.wright

Posts: 36
Registered: 2/28/11
Re: Needs Multiple Examples of Web Session Based Access Credentials
Posted: Oct 15, 2011 8:10 PM   in response to: mbennett

Hi,

The focus of the book itself is more on integration than on the usage details of each individual connector. Your comments would, I think, be more appropriately posted to the connectors-user@incubator.apache.org list.

I've nevertheless opened a JIRA ticket (CONNECTORS-275) to capture your product feature and documentation concerns. Web Connector session-based logon is always a tricky business even for people who understand the Web Connector's model very well. If you sign up for the above list and/or register yourself on Apache JIRA, we can have a more detailed exchange which will hopefully help you get up and crawling, and maybe in the process wind up adding new features or documentation to ManifoldCF while we are at it.

(Apache JIRA is at https://issues.apache.org/jira).

mbennett

Posts: 5
From: California
Registered: 9/25/11
Re: Needs Multiple Examples of Web Session Based Access Credentials
Posted: Oct 16, 2011 12:32 AM   in response to: karl.wright

Hi Karl,

Thank you very much for the weekend reply. And I've been enjoying the later chapters of the book. I know when you write a book, everybody always has a "hey, why didn't you include xyz..." comment.

However, even looking at the Javadoc for the web connector, I'm still not clear why there are two different regular expression columns. I could imagine somebody wanting to extend the web connector, rather than writing one from scratch, so even in a developer's book it'd be nice to see more discussion of the session stuff.

I do plan to post on the list, once my signup has been confirmed; I emailed only today.

Mark

karl.wright

Posts: 36
Registered: 2/28/11
Re: Needs Multiple Examples of Web Session Based Access Credentials
Posted: Oct 16, 2011 6:23 AM   in response to: mbennett

Hi Mark,

Usually, when you subscribe, the subscription is automatic. All you have to do is confirm it by responding to the list-daemon's request. You should have sent mail to connectors-user-subscribe@incubator.apache.org, and you should have received an immediate response. If you did not see a response, you might want to check whether it was caught by your spam filter.

Thanks,
Karl

karl.wright

Posts: 36
Registered: 2/28/11
Re: Needs Multiple Examples of Web Session Based Access Credentials
Posted: Oct 16, 2011 7:14 AM   in response to: mbennett

Also, I responded to your specific questions at https://issues.apache.org/jira/browse/CONNECTORS-275.

Karl
