C++ Recursion - Topic

Member

Posts: 1,358

Joined: Dec 30 2012

Gold: 0.10

Jun 16 2014 10:07pm

I have a class with a vector of objects that are of the same class. i.e:

Code

class CTag {
public:
CTag(const std::string name): m_name(name){}
~CTag(){}

std::string m_name;
std::vector<CTag*> m_children;
};

Basically what I'm doing is parsing html and creating a tree of ctag objects.

Code

<div class="main">
<h1><a href="/">h1 text</a></h1>
<p>
paragraph text
<a href="">url text</a>
<a href="">url text</a>
</p>
</div>

Here's a simple representation of what it'd look like after being parsed into CTags.

Code

I'm trying to create a recursive function that will print out the tags in the same fashion as the html above but can't seem to wrap my head around it >.<

This post was edited by SelfTaught on Jun 16 2014 10:13pm

AbDuCt

Member

Posts: 13,425

Joined: Sep 29 2007

Gold: 0.00

Warn: 20%

Jun 16 2014 10:22pm

I don't think recursion is the way to solve this. If someone passes a long enough html log, you risk the chance of causing a segfault. I think iteration would be a more practical choice.

Although why are you reinventing the wheel. There should be plenty of libraries that do this already.

SelfTaught

Member

Posts: 1,358

Joined: Dec 30 2012

Gold: 0.10

Jun 16 2014 10:44pm

Quote (AbDuCt @ Jun 16 2014 08:22pm)

Ah yeah, good point. Could you give an example of how you'd iterate over this? Even pseudo code would be satisfactory.

I'm trying to create an additional layer of abstraction so I can parse specific data from sites, in specific chunks of html on pages. I haven't been able to find a library which allows me to achieve that without using regular expressions. And... I'm trying to avoid using regular expressions cus I've read it's not a recommend / reliable solution for parsing data from html. There are plenty of libraries for iterating over html / xml though which I'm using.

This post was edited by SelfTaught on Jun 16 2014 10:46pm

AbDuCt

Member

Posts: 13,425

Joined: Sep 29 2007

Gold: 0.00

Warn: 20%

Jun 16 2014 10:56pm

I might write something up tomorrow if you're still looking, I'm a bit tired tonight. Although try to look for a framework like something like http://nokogiri.org/

It allows you to import an entire html file, and dissect it via methods and such. In the languages nokogiri supports it even has enumeration on html entities so you can iterate over each link in the page (links are objects for example).

KrzaQ2

Member

Posts: 9,803

Joined: Jun 28 2005

Gold: 6.67

Jun 17 2014 03:31am

How can you go deeper iteratively?

carteblanche

Member

Posts: 32,925

Joined: Jul 23 2006

Gold: 3,804.50

Jun 17 2014 06:56am

Quote (KrzaQ2 @ Jun 17 2014 05:31am)

How can you go deeper iteratively?

afaik, anything that can be done recursively can be done iteratively.

SelfTaught

Member

Posts: 1,358

Joined: Dec 30 2012

Gold: 0.10

Jun 17 2014 09:16am

Quote (carteblanche @ Jun 17 2014 04:56am)

afaik, anything that can be done recursively can be done iteratively.

Yeah I think theyre both the same but conceptually different.

This post was edited by SelfTaught on Jun 17 2014 09:21am

Futu_Re

Member

Posts: 4,372

Joined: Jul 2 2009

Gold: 3,630.00

Jun 17 2014 10:05am

Quote (KrzaQ2 @ 17 Jun 2014 11:31)

How can you go deeper iteratively?

You can use a stack to store the nodes that need to be processed.

This post was edited by Futu_Re on Jun 17 2014 10:06am

AbDuCt

Member

Posts: 13,425

Joined: Sep 29 2007

Gold: 0.00

Warn: 20%

Jun 17 2014 12:32pm

Quote (KrzaQ2 @ Jun 17 2014 05:31am)

How can you go deeper iteratively?

Could iterate storing details each complete iteration eventually building a tree couldn't you?

Something like

Quote

<html>
<head>
<title>test</title>
<head>

<body>
<p>paragraph</p>
<h1>header</h1>
</body>
</html>

First complete iteration will find all nodes below HTML (head, body) stored to a list or array for processing until all the current levels nodes have been found.

so tree would look like

Quote

HTML [head, body]
/ \
head body

Next you would iterate again but find the next depth to iterate which would be head which was the first to be found, and stored in the list/array. It would first look for new nodes, if any were to be found create an array/list of them to be further searched, if not store the data contained in that node.

Quote

HTML [head[title], body]
/ \
head body
| |
title
/
"test"

Then you would do this again but this time it would result in searching title for new nodes, but none will be found so its data "Test" would be added.

Then you can remove the head node from your list/array because you completely searched it and process your next node in the array/list body, which it would find two nodes to process while iterating. From there it would process again and will either search for more nodes or their data.

Quote

HTML
/ \
head body [body[h1, p]]
| | \
title
/
"test"

End result would be something like this

Quote

HTML
/ \
head body
| | \
title h1 p
/ | \
"test" "header" "paragraph"

Is this idea any better than recursion hell if I know. I just dislike recursion because I'm not bright enough to use it for complex needs.

SelfTaught

Member

Posts: 1,358

Joined: Dec 30 2012

Gold: 0.10

#10

Jun 17 2014 03:03pm

Lol, yeah I don't fully understand which would be more beneficial either. But iteration makes sense and works for me though so I'll go with it. Thanks for taking the time to explain that.

Go Back To Programming & Development Topic List