Monday, February 19, 2007

enterprise introspector restarted

As those of you who have known me for a while are aware, I have long dreamed of introspector functionality for the enterprise.

The challenge is to have enough information about your business so that you can cross-reference your data files against it.

I have been converting my older ontologies into Swoop ones and cleaning up my original work on collecting and understanding business software.

I am using sql-ledger as an example of how the full application stack of an enterprise application can be introspected.

I have opened up a sourceforge feature request for posting the files.

We have a raw database model extracted with a small tool using the DB-Introspector. Here is a mini OWL model for the database, separated out so that it can be updated later.

So far, I have a mini OWL model for the business, which I should describe better.

Basically, I split the tables into entities, relationships, resource descriptors, and physical resources.

Transactions are just relationships over time. Projects are just long (recursive) transactions.

An address is just a resource descriptor.
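
As a rough illustration of that split, here is a minimal Python sketch. The table names and the category given to each one are assumptions made up for the example; they are not taken from the actual OWL model or from the sql-ledger schema.

```python
from enum import Enum


class Kind(Enum):
    ENTITY = "entity"
    RELATIONSHIP = "relationship"
    RESOURCE_DESCRIPTOR = "resource descriptor"
    PHYSICAL_RESOURCE = "physical resource"


# Hypothetical table names and assignments, for illustration only.
TABLE_KINDS = {
    "customer": Kind.ENTITY,
    "invoice": Kind.RELATIONSHIP,    # a transaction: a relationship over time
    "project": Kind.RELATIONSHIP,    # a long (recursive) transaction
    "address": Kind.RESOURCE_DESCRIPTOR,
    "warehouse": Kind.PHYSICAL_RESOURCE,
}


def tables_of(kind):
    """Return the sorted names of all tables classified as `kind`."""
    return sorted(name for name, k in TABLE_KINDS.items() if k is kind)


if __name__ == "__main__":
    for kind in Kind:
        print(kind.value, "->", tables_of(kind))
```

A flat dictionary is enough for a first pass; the OWL model would carry the same information as class membership.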

I will continue posting more and more data about sql-ledger as this project continues.

Here are some of my plans:

Create a model of the menu system in OWL, and be able to address each menu point in some business workflow.

Create a test case for each menu that collects the form data for that menu, and capture a trace of the entire application stack.

Be able to reference each part of the trace to each part of the ontology.

Create an OWL document that describes each form, and be able to show the mapping of the form fields to the database table fields, all the way to the translation strings.

In the end, we should have an RDF descriptor for the entire sql-ledger system.

With that we can then create an RDF descriptor out of a single instance of the system.

hope to hear from you,

mike

Sunday, November 12, 2006

Saving "Insane in the Kernel" from chat

you can see a draft of my song on the gentoo site :
http://dev.gentoo.org/~vapier/wtf
so i wrote a little song
to the tune of insane in the membrane
from cypress hill
lyrics : http://www.citay.de/texte/a_d/cyp_ins.html
Insane in the kernel
insane in the shell
a hacker like me is going insane
insane in the kernel, insane in the shell
Like stallman hacked that emacs
I'll code this here script
and release you an rpm
soon i gotta get my cvs update
Microsoft tries to patent my code
these crackers want to root my server
head underground to the next project
they get mad when they come to raid my server
and i take off in my ssh -C connection
Yeah, i am the hacker
the pilot of this here project
when i dream of the ultraviolet software
and hide from the microsoft cronies
Now do you believe in the unseen?
Look, but don't make your eyes strain
a hacker like me is going insane
insane in the kernel, insane in the shell
repeat
wtf !?

Meme War : Software Industrial Complex vs Free as in Freedom

Originally posted to Groklaw:

Authored by: Anonymous on Monday, November 06 2006 @ 04:22 PM EST

Pam, you write:
"I'm very sad about Novell. Whatever they thought they were doing, they are
now Microsoft's FUD puppy, and contractually they will be having to repeat
Microsoft's FUD with every deal, I think. Every time they tell a prospect that
they have a patent peace with Microsoft, they are implying that one needs one,
and the damage to Linux's good name is obvious right there."

Let us take that statement, replace Novell with Mono, and Linux with DOTGNU.
If you reread the sentence, you will see that this entire pattern has occurred before.


Then we add to the idea chain here :

MS - ximian(mono) - Novell - Suse

Bang, we see that the strong connection between ximian and microsoft has taken place,
and how the huge fight between the two occurred. The industry backed mono with
book contracts. You see how contractual agreements now underline what was before
just an underlying disturbance in the force.

How long has this disturbance in the force been there? I think for much longer.
Something is deeply, darkly wrong at Novell and Microsoft. They are part of the
SIC. The software industrial complex.

Pam also continues to question the timing of this thought:
"How could Novell not see that? Is it too late to nix this devilish
deal?"

The devilish idea could very well have happened much earlier, even back when
the Midnight Commander's author decided to throw the GPL overboard on the mono
project's classlibs and break away from working with the GNU dotgnu p/net
project.

It has been said that ximian has had very good relations with Microsoft for a long time.
It has only been to their benefit, except for the disturbances in the free
software community. Hurt feelings are not something that corporations really need to
care about. People are hurt all the time; it is just business, nothing
personal. What does Freedom matter anyway? Much better that we just take it away from
the user, so they cannot leave. That is what businesses need: stable,
non-free workers bound by all sorts of contracts, visible and invisible. I have
worked inside the SIC for years, and can tell you that it loves microsoft
office!!!

Microsoft has been creating many good relations with developers, and turning
them away from free software for years. But that is just business.
I was also a microsoft junkie, until linus brought me GNU! It was not the FSF
that got me GNU.

Many of the researchers at microsoft use Open Source, GNU and BSD tools,
because they are the best; you can find references to that on the research pages
at microsoft. Microsoft has also redistributed perl and other tools for years. I
even remember seeing xenix from microsoft at radio shack in the 80s.

Intel is also trying to get into the linux software business after downturns on
windows also affected their sales. It is the nature of the market that they are
all tied together in a massive web of interdependencies that define the
technology market.

Now what if Intel finally found itself needing to develop its own version
of linux that is optimized for its chips? What about a linux on the chip, burnt
right in and optimized. A linux chip.

Think about what would happen if your chip contained the ability to compile new
highly optimized programs, a compiler itself. Take the gcc and turn it into a
chip!

Imagine being able to create even new computer chips, or rewire them using nano
robots. A fab chip that contains an entire IC FAB on the chip itself, with
nanobots working on it to produce new chips inside it.

All these new technologies can be implemented with open source tools. Of course to
produce such chips you need to be the most advanced manufacturer on the planet,
but you will still need software. Why not let the people own the software?

Open Source software is Adaptive software.

It has a high viability because it copies itself freely, consuming all available
space. It tends to consume software developers completely until they turn into
memeoids of a given software, defending it to the death. That is the true greatness of
Free Software: the great minds that have been attracted to it. It is the
ability to see them interact and watch how they think, how the software grows. Of
course we see that in the SIC as well. But you don't get to see the sources,
or have time to understand things, most of the time when you work inside the
SIC.

What if the rate of change in the software were used as some type of metric? We
could watch the rate of change of open source versus closed source.

In companies, the rate of change in the source is defined by the contractual
flow of money to the software engineering process that delivers to requirement.
Things don't change for years, and each line of software is so expensive that
there had better be a good reason to change it.

In free software, the rate of change in the source is defined by the meme
strength of the software to copy itself onto a developer, who then embodies and
carries it. See The Selfish Gene by Dawkins for more about memes.

Now we can apply some Dawkins game theory here:

We will see many creative people developing new creative ideas, so there is a
chance that the meme will mutate into something new and exciting. Let's call them
the doves.

Yet not all play for the gain of all. We look at predators. How many of them
will turn against the meme and go against it? Someone like me, growing up as a
Microsoft memeoid and turning into a Free as in freedom GNU/linux memeoid. Or someone
who grew up in Freedom turning against it, such as Ximian.

The population of the software development market is attacked by waves and waves
of memes searching for hosts. Each one hopes to capture a developer working on
it. Each one has some scheme.

The closed system scheme is built around a soft landing. Microsoft software
development tricks you with wizards that hold up the light for you to walk in
the dark, but lead you down the path into complete dependency. It is a warm and
fuzzy place.

The Free Software movement confronts you with someone who is not getting any
good press. In fact the newspapers seem to go out of their way to not talk about
free as in freedom at all. I almost choked the other day when the FAZ was
talking about creative commons and the wikipedia. The capitalistic press just
cannot handle GNU.

They find the idea of Free as in Freedom distasteful, I think. It must have
something to do with the word Manifesto.

Most industrial companies feel the need to control the freedom of their workers.
Maybe they have to, and there is the real core of the problem.

Let's view the world from the viewpoint of an egotistical meme that has an army of
memeoids. Let's call this meme "SIC" (the software industrial complex).
We can define it by a simple set of rules:
1. Those who have must protect it from those who don't.
2. Those who don't must have a problem, so we sure should not help them, they
might multiply.
3. What better way to protect your own than to just disable the
competition with FUD?
4. Capture the minds of the mentally weak, fill them with ideas, make them want
to buy our bugs.
5. The stronger ones we will give real benefits, to control the weaker with
our FUD.
6. Create a hierarchy of FUD that trickles down to the office level and floods
the minds of the workers.

Now, let me tell you the real cost of msoffice to the SIC: it is the cost of
training slow neurons. No one wants to do it. They might start a riot.

What is software all about for the SIC anyway? It needs OFFICE because of the sheer
cost of brainwashing and retraining all those neurons! And to think that the SIC
has been investing in these software memetic brandings for a long time! That is
a lot of energy invested, so it must have some purpose.

Just look at the cost in calories it would take to retrain the nation to use
open office!! What a waste of resources, we should let them have office.

Give the people the ability to learn linux? That goes against the entire idea of
an empire of SIC.

Seriously folks, let's spend those resources on something worthwhile, like giving
internet connections and computing power to the third world. Let's teach the
world to sing in perfect harmony! Let's set an example for future generations and
share with them what we know. Why not let them see how we developed software?
Why not share with them something we have worked hard on?

How many of us are willing and able to put work into becoming the perfect free
software memeoid? Who is willing to make that sacrifice of time and resources?

Do we not need freedom to have freedom? If we don't have a computer, then we
cannot enjoy GNU. If we never learn to read, we cannot program GNU. GNU needs
young minds to copy itself onto. Fresh neurons. We should invest more in
third world software development. But how can you invest without money? No
money, no calories for neuron imprinting.

Anyway, enough for tonight.

mike

---------------------
Update :
I have found a nice page that gives more information, Softpanorama's Stallman page:

Donations pay for expenses, not ailing kids' dreams) are applicable to FSF. Moreover additional question about possible conflict of interests is perfectly applicable too. It looks like FSF accepted generous donations from Eazel. At the same time outspoken Eazel's co-founder, Miguel de Icaza sits on the board of directors of the Free Software Foundation. At this point RMS words "Go Get 'em, gnomes!" appear to have a quite different, more troubling meaning. As Denis Powell noted in his paper Wanna Invest in a Bridge Okay, How About a Donation :

Here is the Linux Planet note about Ximian/FSF:

...Because, you see, it seems as if not all information wants to be free. The financial records of the Free Software Foundation, for instance. I've repeatedly requested them, and those requests have gone unanswered. It is a peculiar irony that I can easily learn far more about the financial dealings of Microsoft Corp., than I can about the Free Software Foundation, where information wants to be free so long as it's other people's information.
I am not alleging impropriety here. It could be that it's all mere coincidence. But it is absolutely undeniable that the FSF has thrown its support behind a desktop controlled by two for-profit companies, one of which has an officer who sits on the FSF's board; the same company has purchased advertising aimed at confounding those who are seeking a desktop that is truly free in every rational sense of the word; and the other company has suggested that users can assist its product in surviving but help it avoid paying its bills by donating to the Free Software Foundation, or else an officer of that company has flung down and danced upon his fiduciary responsibilities by saying, in a communication that is part of his corporate function, that people might want to send money to the FSF instead of the company. And they all do it, evangelists as they are for "free" software, with a holier-than-thou air.

Saved the GNU Choo Choo from MSN/M$/Ximian/Novell/Suse

MSN has taken my blog offline for some reason.

My link is broken.
I get an access denied,
but the page still lives in google:

This was saved from the google cache:

December 08

The GNU Choo Choo

see http://ingeb.org/songs/pardonme.html for lyrics

pardon me boy, is that the GNU choo choo ?

can you afford to board the GNU choo choo?

you leave the M****$oft station at quarter to four

when you hear the whistle blowing at eight to the bar, then you know that hurd os cannot be far!

shovel all the code in, got to keep committing

Ohhhhhh GNU Choo choo there you are!

There's going to be a certain party at the station

RMS is going to cry until I promise to never say open source

GNU Choo choo.... oh there you are..

Tuesday, October 17, 2006

Summit Systems API Wikipedia Node launched

Press Release : call for public participation in documenting the summit systems api.

I call out to all the people who want to know more about the summit systems API to pitch in and help add in new links and web snippets to the article.

here is one part of it :

Summit API Package Names

  • API Toolkit
  • Accounting API
  • Risk API
  • STP API
  • Hedge API
  • Cash Flow
  • Documentation/Document
  • Financial Toolkit
  • Gateway
  • Interface
  • Loader Server/OpenLoader
  • Open DSAPI
  • ValueList
Is this correct? Please update the wiki.

We need to find people who have this precious knowledge to help explain what this whole thing is and how it works.

wikipedia.org/Summit_Systems_API

The wikipedia is a good place because we can combine the terms from finance and computing into the model expressed in the wiki.

The reason for the blog post is to get it into the RSS feeds; wikipedia content is not indexed that quickly.

mike

Wednesday, October 11, 2006

Need for a spam filter built directly into firefox, using fact++ as a box engine to box in spam

I would like to ask you to listen to what I think is my new idea:

A new firefox browser plugin that finds spammers and lets you mark up HTML elements by overwriting their class with class=spam for content that is spam, and even apply user-defined stylesheets to it, for example to make it smaller or red. Advertising could also be tagged as such. Interesting content as well.
Javascript snippets as well.

The key to making this a semantic web application is the YouTube effect: allowing people to post the best spam rules and earn the most recommendations.
Some people will add the new alias of a yahoo spammer via a simple xslt script that generates a web2.0 enhanced stylesheet, javascript, and a virtual server where the reasoning engine lives, and earn a couple of xp. Others, who create intelligent spam rules that cover whole classes of spam, will rise to the top of the spammer hackers community.

Others will set up servers and rent professional space where these reasoning engines live, and provide them with large caches for running efficient lookups. These servers will run the foreground web2.0 process for the users, allowing them to antispam, filter, and deliver the web content. It will be sliced into parts and dissected, then served to you in a steady stream of data context pages, each containing logically related data that is cached together. Execution contexts. Basically a program that you trust, executed on your computer.

I want to have an ssh server that I connect to, or some other way to prevent my web accounts from being accessed by someone else. For example, if I know that I won't access my webmail except from one computer, then I can add such a rule. It could be changed from an administration interface with more advanced access, if yahoo agrees to allow me to limit access.

The whole point is that the common man will be willing to pay a small price for a secure, private antispam webhost filter that he could use anywhere. People would pay rent for servers that run web2.0 apps for them from anywhere.

Announcing this as just an idea, limiting it to firefox, and using free/open source software is a strategy that requires no risk and protects community assets.

What is more valuable to mankind than reliable and up to date information?
If we consider free/open source the best way to create a web of trust and honor among mankind and unite all people, then we must see that it is also creating an incredible capital potential: up to date and reliable information. Each line of source code is a statement about some real or abstract thing that is described.
When source code is published under a free/open source license, it is accessible from around the world and for all time as a free item. It has the potential to solve very many problems and add a positive gain to the economy. Thus the economic impact of free software is great.
It does not have a paying lobby, like the software giants do, which is why most big conservative newspapers don't report on it often or in a positive light. You never see much intelligent analysis of open source by the economist. Unless you have muscle like oracle and are willing to buy full page ads, you don't get much press.

Just look at the wikipedia. It is the most visited page in germany, more than youtube, according to the Frankfurter Allgemeine Zeitung today. That shows you how open ideas are more important than just entertainment. Wikipedia is a platform for people with something to say. This antispam system should be as well.

The semantic web ontology engines, of which there are many (cwm from timbl written in python, pellet written in java, fact++ written in c++), could be added in to allow users to process the results of the Bayesian filter themselves as an rdf datastream. Using an open source project means that you can also get hosting at sourceforge for free and run example servers there.

Users would be able to define and share spam ontologies.

Those ontologies could be used to augment the editing of the spam. The existing Bayes spam filter could be used to view each web page as an email sent from whoever the antispam software thinks is the originator of the message. We would try to trace each part of a webpage to its originator, examine its url content, and match that against our spam database.
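
To make the tracing step a little more concrete, here is a minimal Python sketch using only the standard library. It is not a firefox plugin and it does no Bayesian scoring; it only shows the idea of tracing each linked or embedded part of a page to its originating host and matching that against a spam database. The host names and the `SPAM_HOSTS` set are assumptions for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical spam database: a set of host names known to send spam.
SPAM_HOSTS = {"spam.example.com"}


class OriginTagger(HTMLParser):
    """Collect each linked or embedded URL and tag it by its originating host."""

    def __init__(self, spam_hosts):
        super().__init__()
        self.spam_hosts = spam_hosts
        self.tagged = []  # list of (tag, url, css_class)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Relative URLs have no host and are treated as not spam here.
                host = urlparse(value).hostname or ""
                css_class = "spam" if host in self.spam_hosts else "ok"
                self.tagged.append((tag, value, css_class))


def tag_page(html, spam_hosts=SPAM_HOSTS):
    """Return (tag, url, css_class) for every element that points at a URL."""
    tagger = OriginTagger(spam_hosts)
    tagger.feed(html)
    tagger.close()
    return tagger.tagged
```

A real plugin would then rewrite the class attribute of the matching elements to class=spam so that a user stylesheet can shrink or recolor them.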

So you would allow people to define their own rules in a web 2.0 environment.

All this information can also be defined in a web 2.0 environment. Imagine a very cool web 2.0 spam ontology editor app that would allow you to share your spam rules with other people.

Other ontologies would be used to describe the network of interlinked servers and paths the spammers use to hide themselves.

This idea came from my original intent to mention the need for yahoo antispam software to filter out messages that are dated 2 years in the future.
I needed to manually filter out messages in my yahoo mail that were dated 2008!
I also need to filter out messages in many other languages that I don't speak.
In fact it would be great to build antispam directly into firefox.
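
The future-date rule is simple enough to sketch with the Python standard library. This is only an illustration of the rule, not how yahoo filters mail; the one-day tolerance and the choice to treat a missing time zone as UTC are assumptions.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def is_future_dated(raw_message, now=None, tolerance=timedelta(days=1)):
    """Return True if the Date header lies more than `tolerance` after `now`."""
    now = now or datetime.now(timezone.utc)
    header = message_from_string(raw_message)["Date"]
    if header is None:
        return False  # no Date header: this rule does not apply
    try:
        sent = parsedate_to_datetime(header)
    except (TypeError, ValueError):
        return False  # unparseable date: leave it to other rules
    if sent.tzinfo is None:
        sent = sent.replace(tzinfo=timezone.utc)  # assumption: treat as UTC
    return sent - now > tolerance
```

The tolerance is there so that a sender with a slightly wrong clock or time zone is not flagged.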


Mike

Monday, September 18, 2006

The Web 2.0 will produce Porn 2.0 but not the Porn:Ontology#FreePorn

reposted from a Submission to http://www.oreillynet.com/xml/blog/2006/06/the_7_flaws_of_the_semantic_we.html

This is a reposting of my thoughts on this thread, which have not been published on oreillynet yet. That is fine, but I would like to get a copy of my post, please. Basically I said that Ajax was sexy and that the semantic web is not viable for selling sex, which is why the Web 2.0 will produce Porn 2.0, but not the Porn:Ontology#FreePorn.

Let me restate my point about advertising without going into the name of the #1 consumer of internet advertising: The semantic web seen as a pure web of logic is not viable because it cannot be used for advertising. Otherwise it would be forced to contain opaque data designed to stop logic and appeal to the more primitive forces. Thus you will always have chunks of data that are opaque. If they were only small chunks, they could be filtered out. Therefore the chunks of advertising have to look the same as the rest of the semantic web. But in a closed, secure semantic web of trust there will be no way for such information to be hidden, thus it is excluded.

This is not a problem for the Web 2.0. It can be the advertisement and the logical content at the same time. The user can be led to something that they don't even want, and then the search engines will get money for that.

This fuels the industry and that industry is powerful.

see a quote of my previous post here :

The Content Wrangler, Inc. (presumably Scott Abel) writes :

"Nowadays, adult entertainment companies are not just leaders in earning revenue from the Net, they’re also leaders in the technology arena. In many areas, they are the dominate force. The leaders, not the followers. And, they’re doing as much as possible to protect their turf. They file patents to protect their content matching algorithms and online content management and manipulation functionalities. "

Thanks for listening,

Mike

Wednesday, September 13, 2006

Google Blacklisting of my Post on "Why the semantic web cannot work"

I would like to complain about google blacklisting my post from the search results.
Searching for "No, Google I Don't mean" on the Google Blogsearch
does not return my post.

"Why the semantic web cannot work" also returns nothing.

Searching on Blogger.com returns 3 hits, including one reference to me.

I even made the quote of the day :
08:55
QOTD : pants
The reason why the semantic web cannot work is that it cannot be used to trick people into looking at pay porn sites. - Mike Dupont
from Danny Ayers | Langemarks Cafe

Now, yahoo does much better!


MSN even finds a related post:

Here http://www.spitting-image.net/archives/2004_05.html

Here Comes the Semantic Web?
Although many skeptics point to the historical failure of Strong Artificial Intelligence and the logical inconsistencies of human consensual reality as reasons why the Semantic Web cannot work, my view is that the Semantic Web is going to be bigger than Google in terms of its ultimate impact on civilization. It will be monstrous. Huge. We cannot even predict what it will be used for...
article w/links
---I lived and worked most of my life with people with cognitive and language disorders. I think I've an idea what interacting with the Semantic Web will be like.
Posted by Cieciel at 02:44 AM

Saturday, August 19, 2006

Configuration tools part 1 - lsc large scale c++ programming from lakos

I have started repackaging cdep, adep, and ldep from John Lakos.
I ported them to g++ 4.0.3; there are still some crashes in the cleanup of ldep. I marked the source code, so maybe someone has time to look into this.

prdownloads.sf.net introspector LSC-rpkg-0.1.tgz

Friday, August 04, 2006

Photographs of some of my notes






Sunday, July 30, 2006

No, Google I Don't mean "Gay Films"!!! Why the semantic web cannot work, the properties free and good need to be porn:OpaqueData and porn:Misleading

Hi all,

I am looking for open source tools that deal with amazon
and that have book interfaces.

After using debian apt-cache search amazon to select a couple of
packages that looked interesting, I then googled for them:
" alexandria cowbell gcfilms ".

Although I have the moderate search filter turned on,
the personalized google search suggested:
Meinten Sie (Did you mean): alexandria cowbell gay films

I was pretty shocked, and after I turned on strict filtering
it still returned the same thing.
Well, I guess google is just catering to its porn clients.

Hopefully this will get slashdotted and google will clean up its act.

Now, on the topic of porn, I would like to reiterate my view of
the semantic web.

The reason why the semantic web cannot work is that
it cannot be used to trick people into looking at pay porn sites.

Let me state a couple of assertions :
  1. the semantic web is a medium
  2. for a medium to be viable, it must be usable to sell porn.
  3. all successful mediums have been used to sell porn
  4. there is no such thing as free porn.
  5. There is no free bandwidth.
  6. the semantic web is disjoint from Opaque and Misleading Content.
  7. the porn industry needs a Medium that can be used to Mislead you into looking and clicking through to their pages.
  8. the porn industry needs to create a misleading meaning of free, thus redefining the term free to non-free.
  9. the semantic web is meant to eliminate the possibility of creating misleading content.
Therefore the global semantic web cannot be used as a viable medium for misleading porn advertising, because it is disjoint with Misleading, a subclass of Content.

Of course the semantic web could be used to create an ontology
of porn and be used in local semantic web,
and from that web a misleading html web could be created.

But the semantic web itself cannot be used as a misleading medium
for advertising pay porn.
It is the misleading ads mixed in with the supposedly free content that create
the viability of the medium,
but exactly this mix is what is explicitly excluded from the semantic web.

The result would be a pure porn page that allowed peer to peer exchanging of
porn based on semantic tags;
that would fulfill one aspect. But as soon as you get into the ability to
globally tag all porn,
the issue is that most of the porn is bad. Not only is the attribute
free misleading, but also the term good.

So in the end, the isps/search engines cannot bite the hand of
the low quality porn industry that is feeding them,
and will never fully support the semantic web for the customer.

In fact, this brings me to the conclusion that the customer will
always need to be deceived for advertising to be received,
and that for a medium to be successful it will always need to
contain opaque and misleading data.

I would like to suggest the following:
a namespace porn with the classes porn:OpaqueData and porn:Misleading as subclasses
of porn:FreePorn and porn:GoodPorn.

You can run the ontology on pellet and it will prove that the semantic web is not satisfiable

Here is a nice interactive view of the ontology

basically I state that the semantic web as a medium is disjoint from advertising.

Pellet also says:
B:Disjoint Classes axiom found: DisjointClasses(SemanticWeb Opaque)
Disjoint Classes axiom found: DisjointClasses(SemanticWeb Medium)
Or: unionOf(Misleading Opaque Advertising)
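
For readers without a reasoner at hand, here is a minimal Python sketch of the simplest way such a class becomes unsatisfiable: it is declared to fall under two classes that are also declared disjoint. This is not what Pellet does internally (a real OWL reasoner handles far more than this), and the axioms below are my assumed reading of the assertions in this post, not an export of the ontology.

```python
# Assumed axioms, loosely following the assertions in this post.
SUBCLASS_OF = {
    "SemanticWeb": {"Medium"},       # assertion 1: the semantic web is a medium
    "Advertising": {"Misleading"},
    "Misleading": {"Content"},
}
DISJOINT = [("SemanticWeb", "Medium"), ("SemanticWeb", "Opaque")]


def ancestors(cls, subclass_of):
    """Return cls together with all of its direct and indirect superclasses."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def is_unsatisfiable(cls, subclass_of=SUBCLASS_OF, disjoint=DISJOINT):
    """A class that falls under two disjoint classes can have no instances."""
    up = ancestors(cls, subclass_of)
    return any(a in up and b in up for a, b in disjoint)
```

With these axioms, SemanticWeb is both a Medium and disjoint from Medium, so it can have no instances, which is the result reported above.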

I look forward to some feedback! please send me your comments

mike

Monday, June 12, 2006

Project Management and Free Software

I have been reading about project management in this nice book
[http://www.amazon.de/exec/obidos/ASIN/3455094732/028-8254887-6504516] Project Management für Einzelkämpfer.

It describes how to avoid feature bloat and reduce the scope of your project to the most important things.

This is great advice, and I just wanted to cover some of the issues with using free software.

Let's assume for the moment that you have a task to do, and you have decided to use Free (open source) software. You don't really want to spend time working on the software itself if you don't need to. But let's assume that you have the resources to do this in your team.

First of all, just getting the software to work is an exercise in distraction. Configuring, compiling and testing the software is just one task. But what about selecting the right package from the available ones? Or having to use functions from many incompatible parts?

These tasks are in themselves distractions from the project goal.

Now the real issue is the loss of control. The number of dependencies that a software package has
is not always obvious from the beginning. Just getting the latest version and compiling the software brings many new variables into the equation. How can this be planned and measured?

So, really you get a field full of landmines that have to be defused.

Now look at the number of file formats, and the cost of hooking up the programs to each other.
At last, when you want to publish your results, you will need to produce nice and easy to consume reports with tables.

So, What I propose is a simple introspector framework that collects all the input and output formats of all the software by intercepting the IO calls and the stacks around them. Then we can mark the memory that is the source of the outside data. Then follow the control graph of the assembly. We mark all the nodes that it travels through. This graph contains test data extracted from profiling the testcases and benchmarks. So we need a real time profiling tool that is capable of memory profiling and association of the profile paths with the data traces.

This will finally lead to a point where the data is emitted. There we collect the calls to output and note the marked memory, as to where it came from. I want to summarise the metadata with an added integer or long that represents an index into a table of paths.
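
The last point, summarising the metadata as an integer index into a table of paths, can be sketched in a few lines of Python. This is only a toy model of the bookkeeping: a real introspector would have to intercept IO calls and follow the assembly, which this does not attempt. All class, function, and node names here are hypothetical.

```python
class PathTable:
    """Intern data-flow paths so each piece of data carries only a small integer."""

    def __init__(self):
        self._index = {}   # path tuple -> integer id
        self._paths = []   # integer id -> path tuple

    def intern(self, path):
        key = tuple(path)
        if key not in self._index:
            self._index[key] = len(self._paths)
            self._paths.append(key)
        return self._index[key]

    def lookup(self, path_id):
        return self._paths[path_id]


class Tagged:
    """A value read from outside, marked with the id of the path it has taken."""

    def __init__(self, value, path_id):
        self.value = value
        self.path_id = path_id


def read_input(table, source, value):
    """Mark data at the point where it enters from the outside."""
    return Tagged(value, table.intern([source]))


def pass_through(table, tagged, node):
    """Record that the data travelled through `node` of the control graph."""
    path = table.lookup(tagged.path_id) + (node,)
    return Tagged(tagged.value, table.intern(path))


def emit(table, tagged, sink):
    """At the output call, report where the emitted data came from."""
    return table.lookup(tagged.path_id) + (sink,)
```

Because identical paths are interned once, many values that take the same route through the program share a single table entry.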


more to come

mike

Friday, April 07, 2006

Human text written in perl mode

Here is something I have been experimenting with, representing my thoughts in perl syntax .

RDF ->TESTS
generate tests of the rdf model

TESTS -> RDF
extract rdf model out of tests

RDF -> HUMAN
Read the rdf into a human mind
introspect on visual pattern matching
introspection -> INTROSPECTION MENTAL MODEL -> VISUAL MENTAL MODEL
-> UNDERSTAND DATA COLLECTED;

INTROSPECTION MENTAL MODEL ->

HUMAN -> RDF
write rdf
write patterns matched

printf
type string, integer, float, constant string
variable, constant, in string
count 0,1,2,3
sources =>{
"local variables" => "declare the variables in the function body",
"parameters" => "add parameters to the function"
}

TEXT IN PERL MODE
=> gives you useful indentation model
=> represents this document
=> POST TO BLOG => sub {

},
=> {
NAME => PERL,
CONVERT TO TARGET LANGUAGE => {
NAME=> C ,
CONVERT TO TARGET LANGUAGE => {
NAME => asm,
},
method => [compile it, and then check the errors, parse the errors,
look at the types of errors,
extract the variable data in the error message,
fix the problem by inserting the missing data.
repeat
]
}
}
=> sometimes needs a terminating ;

Tuesday, April 04, 2006

Tips and Tricks using the GCC, CPP and Binutils

For http://www.lug-salem.de/ I am preparing a short presentation showing how to use gcc, cpp and so on for collecting information.

The version I am using is:
gcc (GCC) 4.0.3 20060304 (prerelease) (Debian 4.0.2-10)

The general idea is to specialize the information more and more, adding in more constants.
By dealing with the output of the preprocessor we can get a concise overview of the source code in one file. By looking at the assembler, we can see all kinds of information that are otherwise hard to find.

Here is the outline:
  1. preparation
    1. GNU/sourceforge/debian/cpan/google/redhat/
    2. documentation
    3. mailing list
    4. unpacking the project
    5. looking through the files available
    6. configuration and debugging m4, shell, sed, grep, test and friends
    7. aclocal, automake, autoconf
  2. modification of the makefiles
    1. turning on the verbose mode and save temps in CFLAGS
       CFLAGS = --verbose -save-temps
       CXXFLAGS
  3. CPP and various options
    1. checking how to run the C preprocessor... gcc -E
    2. macro bodies
    3. macro definitions
    4. non-executed blocks
    5. dependencies
  4. compilation with the gcc, what are the passes
    1. CPP
    2. LEX
    3. PARSE
    4. AST
    5. RTL
    6. BACKEND
  5. What dump options are available
    1. CC1
    2. I files /usr/lib/gcc/i486-linux-gnu/4.0.3/cc1 -E -quiet -v -I. -I. -I.. -MD device.d -MF .deps/device.Tpo -MP -MT device.o -MQ device.o -DHAVE_CONFIG_H device.c -mtune=i686 -fworking-directory -O2 -fpch-preprocess -o device.i
    3. S files /usr/lib/gcc/i486-linux-gnu/4.0.3/cc1 -fpreprocessed device.i -quiet -dumpbase device.c -mtune=i686 -auxbase-strip device.o -g -O2 -version -o device.s
    4. tree files
    5. RTL
    6. flow graphs
    7. map files
  6. binutils
    1. nm, objdump, readelf for getting at the results
      1. finding out the sizes of objects
      2. finding names of functions out of the addresses
      3. unmangling names
  7. using and scripting GDB for debugging and data collection
    1. stopping the command immediately with a kill SIGSTOP
    2. scripting the gdb
  8. Creating and dealing with core dumps
    1. ulimit
    2. debugging without debug information (map files and objdump)
    3. libbacktrace
    4. mapping OBJ files to ASM
  9. Doxygen and Co
  10. GraphViz
  11. Profiling: gprof, cachegrind, memory profiles, strace, oprofile

Tips and Tricks using the GCC, CPP and Binutils

For http://www.lug-salem.de/ I am preparing a short presentation showing how to use gcc, cpp and so on for collecting information.

The general idea is to specialize the information more and more, adding in more constants.
By dealing with the output of the preprocessor we can get a concise overview of the source code in one file. By looking at the assembler, we can see all kinds of information that are otherwise hard to find.

Here is the outline:

1. unpacking the project

looking through the files available

2. configuration and debugging
m4, shell, sed, grep, test and friends
aclocal, automake, autoconf

3. modification of the makefiles
turning on the verbose mode and save temps in CFLAGS
creation of new rules

4. CPP and various options
dependencies

5. compilation with the gcc, what are the passes

6. What dump options are available
I files
S files
tree files
RTL
flow graphs
map files

7. binutils: nm, objdump, readelf for getting at the results
finding out the sizes of objects
finding names of functions out of the addresses
unmangling names

8. using and scripting GDB for debugging and data collection

9. Dealing with core dumps
stopping the command immediately with a kill SIGSTOP
debugging without debug information (map files and objdump)
libbacktrace
mapping OBJ files to ASM

10. Doxygen and Co

11. GraphViz

12. Profiling: gprof, cachegrind, memory profiles, strace, oprofile

Tuesday, March 21, 2006

introspection as a mental process

Let us look at the human mind as the most expensive processor imaginable.
The IO is very, very slow and error prone.
It is, however, the best pattern matching server we can afford at this time.
So, the process of pattern matching needs to be augmented.
The introspector will need to collect the data from all types of data sources,
and it will need to do so quickly. Therefore it is important that data samples can be collected and classified. One imaginable way is to get a firm grasp of the gcc toolchain and to use its metadata to collect the data. The metadata is then published for all to use. This would include all byte ranges (Programs(Functions(Blocks(...(Token(Chars(Bytes(Bits(Meaning))))))))) of source code, with all semantic data attached to define the meaning of the source code. Each statement of meaning is a signed declaration from a sender, and it takes effect only when that statement has been evaluated and its contents accepted by a different reader: you, the reader, that is, or even an indexing system.
Such an indexing system is one of my major goals for the introspector. The idea is simple: given a model that is completely understood, i.e. the source code of the compiler, we can model any data expressed in that language that we find on the internet, via google et al, against our model of the language. This will produce a semantic subset of the introspector system, the current set of knowledge that we have about the subject program that we are introspecting.
Thus, a full introspection could be viewed as one mega search over file, local, portable, net and meta sources (cvs, tgz, google(mail), mbox) that queries each resource in each context and builds pages of data for workers to receive, process, analyse, lay out, present, review, search, graph and diagram.
All of these applications are available under linux. If we can introspect them via the gcc and gather semantic information about them, then we can parse those pages and align them with the introspector resources presented. Included in the available introspection data will be audited samples of the various output files that are traced against the metadata and expanded with metadata in an RDF format. Basically each bit, byte or token of data that is of any atomic value is treated as an RDF resource in terms of a gcc datamodel. This is available from the gdb as well.
An introspector interface into the gdb would be of great value.

Listening to Erick Sermon Marvin Gaye - Just Like Music

So the idea is to collect these samples of data that are meant for the human mind and to index them via the gcc. Each and every relevant resource or configuration of resources described in a program serves as the source of a query into a gigantic database (google et al). The results are used to find metadata about the program. By joining the searches, or the results of them, we look for common pages and relationships between them.

Also, here is an important point:

The testing of this data, and the statements of predicates about those tests, can be widely automated, but the final driving force is the human mind, and therefore we need to build the best human interface so that the user can drive the introspection process comfortably.

Monday, March 06, 2006

ideas for the introspector

Be able to import RDF and annotate database tables with RDF information.
Be able to reference a table in the database, a field, etc. by describing the SQL with RDF and mounting it as an RDF database source that is usable in RDF.
Be able to attach an RDF edit control to existing applications.
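
As a rough sketch of the second idea, here is how tables and fields could be described as triples. Everything in it is an assumption made for illustration: it uses sqlite from the Python standard library instead of a real application database, the `http://example.org/db/` namespace and the `db:` predicate names are invented, and the triples are plain tuples rather than a real RDF library.

```python
import sqlite3

BASE = "http://example.org/db/"  # hypothetical namespace


def describe_schema(conn):
    """Return (subject, predicate, object) triples describing every table."""
    triples = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        table_uri = BASE + table
        triples.append((table_uri, "rdf:type", "db:Table"))
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        for _cid, column, col_type, *_rest in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            column_uri = f"{table_uri}#{column}"
            triples.append((table_uri, "db:hasField", column_uri))
            triples.append((column_uri, "db:sqlType", col_type))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
triples = describe_schema(conn)
for triple in triples:
    print(triple)
```

Once every table and field has a URI like this, annotations imported from RDF can point at them directly.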

Monday, February 20, 2006

updated ontology

I have updated the old introspector ontology
and have made it more standard. I will be updating it more.

Raw N3 that will be processed by CWM : introspector.n3
Processed N3: introspecter_gcc_cwm.n3
Process RDF for postprocessing


Object Viewer
DumpOnt

Friday, February 17, 2006

introspector-gcc.0.1

This is the first release of a new gcc introspector implementation; see the blog for docs. It uses a new directory structure as the output, and finally you can use textutils and perl to process the ASTs! I have converters from this directory structure to an HTML page in a tree structure, albeit a very simple one.

I need to update this and add more information on how to build it.
It will only run on i386 for now. Run make in one of the gcc subdirs and ignore the toplevel makefile.

Download from sf.net :

Downloads from objectweb :

Thursday, July 07, 2005

vcg bary rewrite underway

I have been working on rewriting and decoding the bary routines from vcg.

here is the snapshot : http://introspector.sourceforge.net/2005/07/bary-rewrite-0.1.tgz
56cde10020c7700cbc16f0e7074309fa

unwind introspector

Long time no blog, because I have been offline for months. Now I have a DSL connection and can publish some of my files.
  1. Created a new libintrospector that is part of the gcc-4.0
    1. It is not finished, but a work in progress.
    2. Started with the printf introspection: replaced all the printfs in the gcc with a new printf introspector function. This will be using the unwind introspector to create an intelligent stacktrace.
    3. Removed the dependency on raptor and redland.
      1. There is no need for the full redland and raptor functionality in the gcc core for now.
      2. Replaced the implementation with empty stubs.
      3. Will be able to store the rdf data in dwarf2 format and later convert the full dwarf2 data into rdf.
  2. Started on the unwind introspector, a new implementation of libunwind that includes better dwarf2 support.
    1. Extracted only the routines from libunwind that are needed to decode the stack.
    2. Made a simple method for converting the dwarf sections into data sections that are loaded into the image. This simplifies access to the dwarf data and eliminates the need for libelf, moving the dwarf decoding routines into the program.
You can find the first steps here:
03feb06be7c1756d53dd34c5e35a92db http://introspector.sourceforge.net/2005/07/gcc-4.0.0-introspector-0.1.tgz

unwind introspector:
85616411cfa501bf089d0a2744e3c7c0 http://introspector.sourceforge.net/2005/07/unwind-introspector-0.1.tgz

Wednesday, May 25, 2005

gcc 4.0 patch instructions

Dear All,

I have decided to patch the gcc 4.0 and finally produce a clean release of the introspector for popular usage.

The code will be available as a replacement for some files in the gcc-4.0.0 source. I am working on the patches right now, so don't expect it to work yet. Of course you can get the prerequisite packages and test them.

Here are the steps that I needed to do to prepare the introspector:

Install and build the gcc 4.0
  • wget ftp://ftp.cs.tu-berlin.de/pub/gnu/gcc/gcc-4.0.0/gcc-core-4.0.0.tar.bz2
  • tar -xjf gcc-core-4.0.0.tar.bz2
  • mkdir gcc-4.0.0/introspector/
  • cd gcc-4.0.0/introspector
  • ../configure --prefix=/usr/local/introspector --enable-languages=c
    • For now we will only use the C language
  • make
We should have a basic gcc there.

Now we go into the gcc subdir and patch in the files from the cvs

  • cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/introspector login
  • cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/introspector co -P gcc-4
Add all those files to the gcc-4.0.0/gcc/ directory
  • cp gcc-4/* gcc-4.0.0/gcc/
Also, we want to get the raptor and redland libs
  • Raptor
    • wget http://download.librdf.org/source/raptor-1.4.5.tar.gz
    • tar -xzf raptor-1.4.5.tar.gz
    • cd raptor-1.4.5
    • ./configure
    • make
    • make install
  • Redland
    • wget http://download.librdf.org/source/redland-1.0.0.tar.gz
    • tar -xzf redland-1.0.0.tar.gz
    • cd redland-1.0.0
    • ./configure
    • make
    • make install
  • Redland Bindings
    • wget http://download.librdf.org/source/redland-bindings-1.0.0.2.tar.gz
    • tar -xzf redland-bindings-1.0.0.2.tar.gz
    • cd redland-bindings-1.0.0.2
    • ./configure
    • make
    • make install

Friday, March 25, 2005

What is readable source?

What can be considered readable source code? What freedom do you have in expressing yourself and calling that source code?

The FSF defines the four basic freedoms of source code

I feel that freedom #1 "to study how the program works, and adapt it" to my needs is a more important basic freedom than freedom #3 for you to "improve the program, and release your improvements to the public, so that the whole community benefits".

Because the "improvement" could be to create derived works that prevent me from reading your improvement; this is what I term uglified source code.

The GPL allows authors to distribute software that they are the sole author of in uglified form (not in the preferred form for editing) at will, with no punishment. At least the University of Saarland is doing so, with the stated reason of making it hard to read and understand. There is no limit to this, it seems. Or is there?

Your creative expression in uglifying your software, distributing it in the non-preferred form for editing, can limit me from reading the source code.

This is bad in my opinion, and I would like to prevent that from happening to my software under the GPL, if I cannot prevent the proliferation of uglified code in general.

It also means that you should not be able to use my readable sources to create software that is not readable. This should be preventable by the GPL.

Freedom has its limits and there are some freedoms that are more important to the public interest than others.

The GPL in Section 3 states
"The source code for a work means the preferred form of the work for making modifications to it."
Would it be possible to modify the GNU General Public License to add definitions for uglified code? It seems to be impossible to prevent uglified code, but it should be possible to lay down some guidelines.

Here are some suggestions for some definitions, and they would not limit the creative expression of an author.

1. This code may not be uglified, except by the copyright holder. Uglified means it is generated by some automatic tool that changes the code that is edited by humans. The results of the uglification process are not the preferred form for editing. The uglification process is done to take away the ability of a user of the software to read and understand the sources. If some tool is used to process the source, then all the inputs to this tool must be included and the tool must also be included. The uglification process is an automatic process where the original sources and the uglifier software are not distributed. The uglified source is a derived object and can be considered to be like a binary file.

2. This code may not be combined with uglified code, unless by the copyright holder of all parts. Users may not combine other people's code with uglified code to create a derived work.

Rationale:

Because of problems with the University of Saarland releasing uglified code under the GPL, code that was modified to be unreadable, I want to make sure that the software I write cannot be included in such a tool. The GPL does not prevent people from creating uglified code. But I should be able to prevent someone from combining obfuscated code with my code and creating a derived work.

I see this as a conflict between the freedom of expression and the freedom to read and understand.

The original author of the software can distribute the original source in an obfuscated form, automatically converted into something that is really hard to read and modify and that is not even the preferred form for editing, and there is nothing anyone can do against that.

When it is no longer the preferred form of editable source code, it becomes a difficult issue, because the copyright holder does not sue themselves for violation of the GPL.


Feedback :

Thanks to Alfred M. Szmidt (AMS) for his criticism and advice.

Thanks to S11001001 for pointing out that a new license might not even be needed: I don't even know if a new license is needed, or if the GPL needs to be clarified in this case. But I do still want to tell you my idea; maybe it can be used to create a more watertight definition of the preferred form of source in the GPL.

Thanks MarcusU from DotGNU for Spel Kheking.

References :
Rusty's thoughts on the clauses in the GPL here.
This is also the topic of discussion in the LKML.
The debian policy makers have voted on the topic of the definition of source code here
This topic was discussed on debian legal as well in great detail.

The GCC supports VCG output, but it is also an issue that it is obfuscated
mentioned here
Look for example at vcg.1.30/src/step1.c for an example of the obfuscator output. This is not source within the meaning of the GPL. A strict view would say that given a GPLed program without full source, we cannot distribute it at all; even with a less strict view that the authors intended this version to be distributed, distributing a program without proper sources from a *.gnu.org site seems dubious.

Loic Dachary mentions that we are not allowed to apply the GPL to the VCG at all because it is obfuscated:
I'm having a problem related to the distribution of VCG, as published at http://rw4.cs.uni-sb.de/users/sander/html/gsvcg1.html. Although VCG is published under the GNU GPL, it contains obfuscated source code. As a result, I'm unable to redistribute it because I would violate the GNU GPL that states that the sources are defined as "the preferred form of the work for making modifications to it".

Friday, February 25, 2005

Removed Text from the Introspector Lightning talk at FOSDEM 2005

Here is the material that did not make it into the Original Speech.

Involving the human mind in the process of introspection


One of the major tasks that I see in this process of understanding code is the involvement of the HUMAN MIND, the user.

I think that by feeding information about the software to the visual cortex via the eyes, or by whatever means might be used by disabled persons, the mind's natural pattern matching and model building process will take over. When the mind is then able to pose new questions to the introspector system to gain more information, the viewpoint of the visualization system is focused on the newly selected topic.

The mind will then focus on interesting aspects. The next step is to allow the patterns found to be captured and fed back into the tool. This creates a feedback loop where the meta programming tool is guided actively by the mind exploring the software.

A meta programming tool will then be successful when it allows the programmer to directly, naturally, and efficiently access the data collected out of both the software and the context of the software.

Operations are needed on the data in the form of Structures, Lists, Trees, Graphs, Relational databases, Vectors in Memory and simple text files. All of these forms of data are needed to allow the programmer to choose the right access method to attack the problem at hand.

Of course GUIs will be of value, and visualization tools that can lay out and filter graphs will be of use. But these tools need to be secondary to the goal of raw access to the data. All of this data needs to be accessed via . I personally think that the graph layout algorithms can be applied to data structures to optimize the memory of them.

The conclusion is that the introspector needs to be as slim as possible and as efficient as possible in providing useful information to the programmer. But it needs to be as open and usable as possible, providing redundant representations of the meta data so that it can be exploited.

The Context of programming

The idea of context is difficult to define in general for meta-programs, because you have a meta-context! The context of a meta-program is related to all the contexts of the object-programs that it operates on.

Because of the idioms and the style of the programmer, the important data about a program can be encoded in a unique and programmer dependent style. This style or character of the code embodies the essence of the coder. Because of the seemingly unlimited expressibility of a programmer, there is no way to dictate how a particular idea will be encoded. Naming and Style Conventions, Coding Styles, and Documentation contain context specific information that is needed to understand the code.

To make the problem worse, the Dreams and Visions of the programmer, Conversations between programmers over coffee, Unwritten assumptions, and Cultural Background play a role in the style of code written.

Programming is Communication

Writing code is a form of formal communication! When you view code as a message, you can open your eyes to the interpersonal and social aspects of code that aid in its understanding.

The act of writing code has at least four aspects :
  1. Communication of instructions to the compiler (and other meta-programs) and finally to the computer for execution. So, in the first step, you write programs for a computer. You are communicating the instructions of how the object program is to execute as the real job of a programmer. The Programmer communicates with the authors of the meta-programming tool via their Agent, the meta-program.
  2. Communication of concepts to one's future self. The second step is to write a program so that you might be able to understand and reuse your mental state at the time of writing, the communication of the concepts to yourself.
  3. Communication of concepts to other programmers, and third parties who might use or even further develop your code.
  4. Communication of meta-data back to the programmer in the form of feedback. Compiler error messages, for example.
Intercepting Communication is one of the main goals of the introspector

The interception of that communication and its decoding by a third party is the next step when the code is taken out of the context of the original message to the Computer Chip.

The problem is that an outside person will not easily be able to fully understand the captured message, exchanged in a closed context with no external reference information.

So we have now set the scene for meta programming: people creating tools for their own usage as messages to themselves and a small user group, and others trying to intercept those messages.

A program is a message. Understanding a program involves decoding that message and recoding it into your context. Usage of contextual information outside of the code itself is often needed to decode the message. The introspector allows you to collect this reference data in a central repository and supports the understanding of the message.

Examples and classes of meta programs

Some examples of what I consider to fall in the class of meta-programs are :
  • compilers, translators and interpreters are programs that process and execute other programs
  • Custom User Defined programs that are written by users to process the software
Programs that affect and control the process of creating the software
  • build tools like Autoconf, Make, Automake, Ant that control the compilation and build process
  • I don't consider tools that are just used in the build process to be meta-programs, even if they can be used to implement meta programs, because they are not dealing with the software directly; examples are Grep, Bash, Sed, and more trivially Tar, Gz and the Linux Kernel. These programs however contain important meta-data related to the program and will need to have interceptors installed to collect that data.
  • Tools that deal with software packages like dpkg, rpm and apt can also be considered to be meta-programs because they are providers and consumers of meta-data about the software.
  • Linkers, Assemblers
  • optimization routines of the gcc
User Space Run Time Functionality
  • The reflection mechanisms of java and the eval function of perl
  • Dynamic languages such as Lisp, Prolog, Haskell, to some extent Perl C# and many other advanced languages that have direct support for meta-programming
Profilers and runtime optimization routines
  • Profilers and Data Collection routines
  • Dynamic Linkers
  • JIT tools and partial specialization routines
  • Process Introspection and snapshoting (Core Dumps included)
  • The GDB debugger
Code Generators
  • Language Creation tools such as Yacc, Lex, Antlr and TreeCC
  • program transformation tools like refactoring browser tools, aspect oriented browsers, generic programming tools
Programs that extract information from your code and deliver it to the user
  • code validation and model checking tools such as lint and more advanced model checking tools
  • reverse engineering tools, case tools, program visualization tools
  • intelligent code editors and program browsers that have a limited understanding of the code (emacs falls into this category in the strictest sense)
  • automatic documentation tools like Doxygen
  • Even IDEs can be considered strictly a meta-program, or at least a container for them.
  • of course, I would consider my pet project, the introspector a meta program.

Metaprograms are like mushrooms, they sprout out of dark, damp and dead parts of existing code

The one thing that I have observed is that very many meta-programming projects just spontaneously sprout out of the ground. Each has a similar goal: processing programs and making programming easier, that is, meta-programming. Most such programs are not reusable or reused, and they mostly do not provide any well defined interface to their meta-data.

In lisp you have a standard Meta Object Protocol (MOP), which is well thought out but also very lisp specific. On the other side, there is a huge amount of meta-data in lisp that does not have a standard, well defined interface into it.
The more context specific the meta-data and the meta-program are, the more effective they are for the context they were created for; the best example is an assembler or compiler optimized for a specific processor. There is a huge number of research and experimental systems that provide various degrees of freedom to the programmer and user.

For the most part, meta programming tools can be classified into three classes:

1. So context specific that they cannot be generally reused and are generally disposable. They sprout out of some concrete problem and are just like mushrooms that grow on some rotting material. The scope of the coverage of the fungus is limited by the scope of the problems in the object-program.

2. So abstract and complex as to be not easily usable, understandable, or practical. The context is artificial, abstract and mathematical. This is a different form of being context specific: the context is the mind of the author or his limited slice of research. This is a classical example of a message from the programmer to himself, which I will explain later, lacking any reference to the outside world.

3. The few rare cases are practical tools that find a safe mix between abstraction and context. The C language has a very small set of abstractions, and the GCC has been able to define routines that are reusable between various languages. The problem with these practical tools is that they are in general lacking any of the advanced meta-programming features that are found in the previous two classes.

Metaprogramming tools normally don't work together and for the most part they don't work for you

For the average programmer working on an average system, very little is available for their usage. When you sit down to work on a normal programming task, let's say one associated with working on the source of any of the GNU tools, there are basically no standard, integrated and usable meta-programming tools that you can use for all aspects of your work.

There is very little in terms of a standard interface or set of requirements that are placed on meta programs in general. This is due to the fact that programming is a form of formal, context specific communication, which I will explain later.

Metaprogramming tools are disposable

Meta programs are tools that are for the most part disposable. Their effects result in bugs being found and fixed, as with validators, or in documentation being produced, or in code being generated. The programs themselves interact with the programmer via configuration files, a GUI or individual commands. The programmer guides and controls the meta-programming process. So in the end, the metaprogramming tools are only as good as they are usable by a programmer. They are only as good as they are applicable to a given problem.

The set of the meta-data for a given program is very large

The compiler is a meta-program that contains a large amount of data about the software at hand, but there is a large set of programs that make up the build process. Luckily, for most interesting programs the source for all these programs is available. So all of the tools that are used to direct the build of the software can be considered meta programs that affect the final object-software. If we look at all the data that is contained by all the instances of the meta-programs then we define a large set of meta data that

All these tools, when considered together, use and process many aspects of the software. So we can say that the total amount of data in memory at all points of the process of the running of the meta software contains a very good picture of the software that is being compiled. Now it is the question, how can we get the meta programs to communicate this data to us?!


Recoding the message into a RDF with an explicit context

Now, once a program has been understood, it can be encoded into a context independent representation, like RDF, with explicit context, relationships and meaning.

RDF means Resource Description Framework. Resources are things of value that are worth identifying and describing. Every single aspect of the software can be represented as a graph. The nodes in the graph are resources or literals. The edges are called predicates; they can represent pointers, containment or basically any binary relationship between nodes. In RDF each type of edge is another type of resource and can be defined in detail.

We can assign a unique resource identifier in the form of a URI to each identifier, variable, each value, each function call of the software on the static level. By adding in the concept of a program instance, time and computer we can also assign resources to dynamic things like values in memory, function stacks and frames.
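
To make the static case concrete, here is a small sketch that gives every identifier occurrence in a C fragment its own URI and describes it with triples. It is an illustration under assumptions: the `http://example.org/program/` namespace, the predicate names `occurrenceOf` and `line`, and the regular-expression tokenizer are all invented for this example. A real version would take the symbols from the compiler rather than from a regular expression.

```python
import re

BASE = "http://example.org/program/"  # hypothetical namespace
C_KEYWORDS = {"int", "return", "if", "else", "while", "for", "char", "void"}
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def identifier_triples(filename, source):
    """Give every identifier occurrence a URI and describe it with triples."""
    triples = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in IDENTIFIER.finditer(line):
            name = match.group()
            if name in C_KEYWORDS:
                continue
            symbol = f"{BASE}{filename}#{name}"
            occurrence = f"{symbol}@{line_no}:{match.start()}"
            triples.append((occurrence, "occurrenceOf", symbol))
            triples.append((occurrence, "line", line_no))
    return triples


source = "int add(int a, int b) {\n  return a + b;\n}\n"
triples = identifier_triples("add.c", source)
symbols = sorted({o for _s, p, o in triples if p == "occurrenceOf"})
print(symbols)
```

The dynamic resources (values in memory, stack frames) would need the program instance, time and machine added to the URI in the same way.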

Once we have started to build this model of the program, the communication about that program in the form of Documentation, Emails, Bug Reports, Feature Requests and Specifications can be decoded, because it will reference symbols in the code. Or the code will reference symbols in the communication.

Now, the symbols that occur in the source code could be constant integers, constant strings, identifiers in the code, or even sets and sequences of types without names.

So, the first step in decoding a program would be to index the set of all identifiers. Then the relationship between the identifiers and the concepts needs to be determined. Mapping the names onto wordnet resources would be a great start. The relationships between identifiers need to be discovered.

By transforming the source code into a set of RDF statements that describe it, and also converting the context data into a similar form, a union of the two graphs can be created. Relationships between the two can then be found.
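
The indexing step could start out as simply as this. It is a sketch under assumptions: the function names and the sample identifiers are invented, and the lookup of each word in wordnet is left out entirely.

```python
import re
from collections import defaultdict


def split_identifier(name):
    """Split snake_case and camelCase identifiers into lower-case words."""
    words = []
    for part in name.split("_"):
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", part))
    return [w.lower() for w in words]


def build_index(identifiers):
    """Map each word to the set of identifiers that contain it."""
    index = defaultdict(set)
    for name in identifiers:
        for word in split_identifier(name):
            index[word].add(name)
    return index


index = build_index(["parse_tree", "TreeNode", "node_count", "parseRDFFile"])
print(sorted(index["tree"]))
print(sorted(index["node"]))
```

Identifiers that share a word end up in the same bucket, which gives a first, crude hint at relationships between them.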
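
A toy version of that union, with triples as plain Python tuples, might look like this. The resource names (`sym:`, `mail:`, `bug:`) and the predicates are invented for illustration, and a real system would use an RDF store rather than sets.

```python
def nodes(graph):
    """All subjects and objects that appear in a set of triples."""
    return {s for s, _p, _o in graph} | {o for _s, _p, o in graph}


code_graph = {
    ("src:main.c", "defines", "sym:parse_tree"),
    ("sym:parse_tree", "calls", "sym:alloc_node"),
}
context_graph = {
    ("mail:1234", "mentions", "sym:parse_tree"),
    ("bug:77", "mentions", "sym:alloc_node"),
    ("bug:77", "reportedBy", "person:alice"),
}

# The union is just the set of all statements from both sides.
merged = code_graph | context_graph
# Resources that occur in both graphs are where the two connect.
shared = nodes(code_graph) & nodes(context_graph)

# For every shared symbol, list the context resources that talk about it.
links = {
    sym: sorted(s for s, _p, o in context_graph if o == sym)
    for sym in sorted(shared)
}
print(links)
```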


Application of Meta-Data to the Interceptor Pattern

If the meta program is changed so that it emits this data in a usable common format, then this data can be put into context and used to piece together a total picture of the context of the software. This is what I call the interception pattern. The message between the programmer and the machine is intercepted and recorded. There needs to be a common API for this interception. There also needs to be tools for automating this interception. That can be done by the usage of the meta data collected from the compiler and the build tools in the first pass. By decoding the data structures of the build tools we can semi automatically create serialization routines. By applying the techniques described here, each program can be trained to communicate its meta data to the introspector. Each program that is hooked up to this framework increases the knowledge available for the integration task.

The idea of the semantic printf function

The next idea would be to replace the printf routines with a general routine to query and extract the data that is available in the context of that printf. Given that we will have access to a list of all the variables available in any given context, and that we will also know every variable that can be directly or indirectly reached from those variables, it will be possible to invoke user-specified extraction and interception code at the point of the printf. The printf could reference its position in the meta-data, giving each variable to be emitted a very detailed context.
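To make the idea concrete, here is a rough Python sketch. It is not how the introspector does it: Python already keeps the variable list at run time, so the `inspect` module stands in for the compiler meta-data that a C program would need. The names `semantic_print` and `compute_total` are made up for the example.

```python
import inspect
import json

def semantic_print(message, extract=None):
    """Emit a record describing the caller's context instead of a bare string.

    `extract` is an optional user-specified function that receives the
    caller's local variables and returns the subset to record.
    """
    caller = inspect.currentframe().f_back
    local_vars = dict(caller.f_locals)
    selected = extract(local_vars) if extract else local_vars
    record = {
        "message": message,
        "function": caller.f_code.co_name,
        "file": caller.f_code.co_filename,
        "line": caller.f_lineno,
        "variables": {name: repr(value) for name, value in selected.items()},
    }
    print(json.dumps(record, sort_keys=True))
    return record

def compute_total(prices, tax_rate):
    total = sum(prices) * (1 + tax_rate)
    # One call records the message plus every variable visible at this point.
    return semantic_print("total computed")

record = compute_total([10, 20], 0.5)
```

A user-supplied `extract` function, for example `lambda v: {"total": v["total"]}`, plays the role of the user-specified extraction code: it decides which part of the context is worth emitting.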

The data that we need is there, we just need to get at it

As the user of a meta program, you often feel that you are a second-class citizen. Yes, well that is the core problem that I am addressing. Most programs are written to solve a problem for some person. The fact that you are using it is secondary. The gcc compiler itself is a good example of a self-serving program. It represents a huge amount of knowledge that is locked up in a representation that is highly inaccessible. The fact is that much of the information that the user of the compiler needs and has to manually enter is available to the compiler developers is

Because of the large number of open source tools, and the fact that all the GNU tools are based on a limited core set of tools all available in source format, they are a perfect target for the collection of meta data. Not only are all the source histories available, but also the documentation, the mailing lists, and basically all the contextual information. There is a huge amount of publicly available data about the GNU project.

The adding of meta data to C

The history of C and C-like languages can be seen as an evolution of meta data and meta programs. Each new addition to the language gives more meta data to the meta program, the compiler. Each language breaks with the previous version for some reasons, good or bad, and you are forced to rewrite your code to use the new features. In the end, the process is just the adding of more meta-data to the existing program and the interpreting of this advanced meta-data by a more advanced meta-program, a better compiler. There is no reason that this meta information and the validation of it cannot be added via other means and the processing of it decoupled from the monolithic process. Even meta data about the persistence and the network accessibility of software, as in DCE IDL and CORBA, can be specified in the same manner on top of the existing software without new syntaxes.

The reading of introspector augmented meta-data back into the meta programs.

It is reasonable to consider the idea of reading the instances of the data stored in the meta programs directly out of the introspector. The API that the introspector gives for intercepting the meta-data can then be used to read the updated data back out, or even to read it from another source. In this manner, entire programs could be translated from other languages or generated programmatically. The entire set of intermediate files and file formats can be unified into a common data representation and communication mechanism. This is possible because the programs to be modified are free software and they can be modified to provide this interface. The idea of the kernel module would allow for this to be done without changing the software.

The monolith and the network

The fact that GCC is linked in the way that it is, is an organisational, political and sociological decision. It can also be split up into many independent functions. Given a mechanism for intercepting, communicating and introspecting the function frames, any conceivable network of processing can be implemented without using the archaic linking mechanism used by the existing gcc.

The linker and function frame is a data bus, that can be intercepted


The linker and the function call frame represent a path of data communication. The compiler produces tight bindings between functions and the linker copies them into the same executable. Given enough meta data about the function call, this data can be packed into a neutral data format and the functions can be implemented in a completely isolated and separated process.
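As a small illustration of that last point, the sketch below packs a function call into JSON, hands it to a separate interpreter process over a pipe, and unpacks the reply. Everything in it is an assumption made for the example: JSON as the neutral format, the message layout, and the function names `remote_call`, `add` and `concat`. A real system would generate such wrappers from the compiler's meta data instead of writing them by hand.

```python
import json
import subprocess
import sys

# Worker source run in a separate interpreter process. It knows only the
# neutral JSON message format, not the caller's memory layout.
WORKER = r"""
import json, sys
FUNCTIONS = {"add": lambda a, b: a + b, "concat": lambda a, b: str(a) + str(b)}
call = json.load(sys.stdin)
result = FUNCTIONS[call["function"]](*call["args"])
json.dump({"function": call["function"], "result": result}, sys.stdout)
"""

def remote_call(function, *args):
    """Pack a call into JSON, run it in an isolated process, unpack the reply."""
    message = json.dumps({"function": function, "args": list(args)})
    completed = subprocess.run(
        [sys.executable, "-c", WORKER],
        input=message, capture_output=True, text=True, check=True,
    )
    return json.loads(completed.stdout)["result"]

print(remote_call("add", 2, 3))
```

The caller and the worker share nothing but the message, so the message itself becomes a point where the call can be intercepted, logged or rerouted.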

Simplicity and Practicality are the key factors for the success of free software

The great science fiction author Stanisław Lem writes in his article Metainformationstheorie [1] (a Polish to German translation) that the evolution of ideas in computer science is a natural selection function that selects ideas by their commercial success and not by the gain in knowledge. He cites the meme idea of Richard Dawkins, who compared information to genes as self-replicating individuals competing for resources.

We can treat free software as a meme and analyse its attributes.

For free software this success is defined in terms of the following factors:
  1. Replication - How often a software is executed (invoked), copied, downloaded, how often the ideas are copied, how often the software is used! We can see that the invocation of a program is the copying of the software into the core of the processor, in the moment it becomes active. We can measure the success of software as the core share of it. How often is it copied into the core of the computer, how often does it become alive?
  2. Mutation - How often a software is changed to adapt to the environment. This is a function of how useful the software is and how easy it is to mutate it into something more useful. The paradox of free software is that the mutation functions are expensive because of the nature of the protection mechanism. Free software needs to protect itself as a meme from being mutated into non-free software.
  3. Resources - The amount of work, time and space that is required to use, understand and mutate the software. This is the cost function that is to be minimized. The meme's success however is
These factors help explain Richard Gabriel's paradoxical phenomenon of "Worse is Better" [2].
(Being from New Jersey, I naturally identify with the New Jersey "Worse is Better" attitude.) Simplicity, practicality and interactivity are the most important factors in the success of an idea.

I say that interactivity is important because it is a simple and practical way of reducing the costs of learning and using a software. When people are evaluating a software, they want to determine within a very short period whether these factors are met.

Free software has the paradoxical feature that the source code of successful free software tools is complex, impractical and not interactive. The situation created is that the resources that need to be invested into learning the context of free software are so high that the programmer becomes bound to that context and identifies with it.

How does the GPL prevent the usage of meta-data ?

This is going to get hairy here; this is a question that I have been thinking about for many years!
The short answer is: there is nothing stopping any program from reading the meta-data of free software.

Reading the meta-data does not create a derived work. The meta-data of an object-program is Copyright covers the copying of the derived works. Of course if the structure of the meta-data is context specific and is a derived work of the object-program.

The solution to this entire problem can be stated as follows :

Any meta-data about an object-program that is intercepted from inside a meta-program in the foreign program-context can be translated into a user-context without creating a derived work; only the translation routine is derived from the structure of the foreign context.

Because of the amount of data available about free software, open source and even shared source software, all of them can be translated in this manner.

The conflict between free software context and the open meta-data

The user is interested in practicality, simplicity and interactivity. Free software as a meme is interested in memetic success, replication and mutation and the controlling of resources. These two are at odds. Free software tries to protect itself by making access to the meta-data impractical, complex and non-interactive. The introspector has the goal of resolving this conflict and making the meta-data accessible to the user.

Conclusion

Source Code is in the end just meta-data that flows in a network of meta-programs. The communication between these meta-programs is handled via primitive mechanisms that inhibit sharing of data.

Via modification of the meta programs, a man in the middle attack can be implemented to intercept the messages from the programmer to the computer, augment this message with contextual information and unify it into a global knowledge base. Given a critical mass of meta-data the messages and data flows of a program can be understood.

This represents an end to the existing concept that using a function creates a derived work, for the very fact that the compiler and linker can semi-automatically create wrappers, interceptors, serializers and introspection code for any source code that is embedded in a critical mass of meta-data.

This represents a shift in power away from the creators of meta-tools to the users of them and will give more freedom to the users of free software.

[1] Stanisław Lem : Metainformationstheorie http://www.heise.de/tp/deutsch/kolumnen/lem/5443/1.html
[2] Richard Gabriel : Worse is better http://www.jwz.org/doc/worse-is-better.html

Thursday, February 24, 2005

Lambda:Rule idea

[a lambda:Rule;
lambda:rule set_homepage;
lambda:args (:nick,:uri);
lambda:string "^addturtle [a foaf:Person; foaf:nick :nick; foaf:homepage :uri]."]


where lambda:rule is the name of the rule, lambda:args is a list of the args and lambda:string is the string in which the args are replaced. That could be used to define the rules in turtle.
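A quick sketch of how such a rule might be expanded, in Python. The dictionary encoding of the rule, the function name `apply_rule` and the sample values are assumptions for illustration; nothing here parses real turtle.

```python
import re

# The rule from above, held as plain Python data (an assumed encoding).
RULE = {
    "rule": "set_homepage",
    "args": [":nick", ":uri"],
    "string": "^addturtle [a foaf:Person; foaf:nick :nick; foaf:homepage :uri].",
}

def apply_rule(rule, *values):
    """Substitute each placeholder in rule['string'] with the matching value."""
    if len(values) != len(rule["args"]):
        raise ValueError(f"{rule['rule']} expects {len(rule['args'])} arguments")
    bindings = dict(zip(rule["args"], values))
    # Match whole placeholders only, so ':nick' never matches inside 'foaf:nick'.
    pattern = re.compile(
        r"(?<![\w:])(" + "|".join(map(re.escape, bindings)) + r")(?!\w)"
    )
    return pattern.sub(lambda m: bindings[m.group(1)], rule["string"])

print(apply_rule(RULE, '"mdupont"', "<http://example.org/mike>"))
```

The lookbehind matters here because the placeholder `:nick` also appears as the tail of the property name `foaf:nick`, and a naive string replace would damage the property.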

Wednesday, February 23, 2005

Introspector Lightning talk at FOSDEM 2005

Speech for a short 15 minute lightning presentation on the introspector on Sunday the 27th of Feb. at the FOSDEM.

Because I have had problems with timing my presentations in the past, I have decided to write a script for my 15 minutes to make sure that we get as much information packed in as possible.

After reviewing my material, I have discovered that there is enough material for at least an hour's presentation. I have moved it out to my blog and you can find it at http://rdfintrospector.blogspot.com/2005/02/removed-text-from-introspector.html

Introduction : 1 minute

Hello all, thanks for showing up today to listen to my presentation. I would like to talk to you about something that I have been obsessed with for years : the true nature of programs that process other programs, what I would like to call "meta-programs".

Because of the time limitations on this speech, I will not be able to take any questions, or be able to go into much detail. The purpose of this presentation is to state once and for all the scope and purpose of the introspector project and call out for support from the free software community.

Let me start by stating my personal historical motivations and the core questions to be answered by the introspector project, then get right to the core ideas that I would like to imprint upon you while providing definitions for the terms I will be using.

I will not be able to present supporting details for my theory, or present the full history and current state of the project, because of the lack of time. I have however included them in this paper for your review and look forward to discussing them with you after my lightning talk if you are interested.

I think it is more important to understand the scope and setting of the introspector than to understand how it is currently implemented. The point is that introspection is a mental process; it is a way of thinking more than it is a software.

The Original Motivation : RAD 2 minutes (3rd minute)

When I first started learning computer programming as a teenager in the 80s, I was drawn to the ideas of turbo prolog, which I played around with but never really could make use of. What I did make use of, however, and became fascinated with, was the DBASE III system, which was widely used at the time.

The thing that made DBASE so attractive was that it was so simple, practical and interactive. I followed the evolutionary path of these simple database systems from DBASE III, to DBASE IV, to Clipper, to Borland Paradox, and finally to Microsoft Access, and I became convinced of the power of simple database solutions.

RAD (Rapid Application Development) was one of the key ideas of the 80s, and I was imprinted at an early age by this idea. The usage of screen painters, simple interactive development environments, program generators and reporting tools were the key ideas of RAD.

Later when I started to seriously program in C and SQL I was disappointed with the amount of work and resources that needed to be put into creating the same simple functionality that was available in DBASE III! I longed for a way to iterate over the fields of a record in the simple manner that you could in DBASE. This functionality was key in allowing for the creation of screen painters, report generators and all types of really useful programs. In short, RAD!

I was deeply interested in all types of tools to make this work simpler, and looked into Case Tools, persistence toolkits, and in the end, wound up writing my own program generators for C and C++ from the very beginning that emulated the best parts of what I had with DBASE!

OEW: or why write your own parser ? 1 minute (4th minute)

I worked for Innovative Software back in 94, now called IS-teledata. I was attracted to the now discontinued program OEW, the Object Engineering Workbench, a C++ round-trip computer aided software engineering (CASE) tool. It could parse your C++ code, allow it to be edited in a simple self-styled diagram (this was before UML, and the fact is that Booch's clouds at the time were just too complex to draw!) and it could finally regenerate the new code right back out, producing documentation and reports. It had a lossy C++ parser that could not handle all the C++ code you threw at it. My question at the time was, "why don't we just use the GCC compiler's parser"? I then got a copy of the source code of GCC and tried to read it! I was LOST in the complexity of the code. There was no way that I could make sense out of it, I did not even know where to start!

This however was the second key motivational idea behind the introspector project. And now, 10 years later I have started to answer that question.

Part of the answer to that question is a second question: can the GPL prevent the usage of the parser by another program? The short answer is that there is nothing preventing this from happening! What if the parser were to emit all the data that it contains about the program at hand in a readable format? I will try to answer that question in more detail later, after I define my terms.

Why doesn't the Compiler have a public data model and an external representation of its data? 1 minute (fifth minute)

The next question that had to be answered was why there is no model of the compiler's internal data! The OEW tool was also lacking this feature. At the time I wanted an API into the OEW case tool, a way to get at the data, so that I could create a RAD like tool for C++ and have the features of DBASEII!

This was the key problem that prevented me from proceeding on many levels. My answer to this question is presented here: The model of the compiler data is really the same as the model of communication itself. This communication is context specific, and takes place between the programmer and a software agent working on behalf of the compiler writer.

The attempts to define standards such as MOF (Meta-Object Facility) and XMI (XML Metadata Interchange) show how complex and impractical it is to define the model of the metadata of software. The semantic web project is the best attempt that I have seen so far at being able to capture and annotate the models of software. That is why I am using RDF and OWL as the basis for the storage of the data of the introspector.

Core Ideas

Here are the core ideas and definitions of the introspector project. If you leave this presentation today with these imprinted in your mind, then I will have been successful:
  1. The introspector is a pattern for the behaviour of the programmer, a process that is applied to your software with assistance of the introspector tools.
  2. All software programs, source code and binaries are messages from their author to other people and to agents that represent them. In the end the processor, the chip, is an agent that represents the chip producer but is acting on behalf of the owner of the chip. The programmer communicates with the chip to tell it what to do. The compiler is an agent that acts as an intermediary between the author of the program and the chip itself. The programmer produces software that acts as an agent that is told what to do by the user and then translates that message via a network of messages to the chip, while the software is running. (1 minute. 6th minute)
  3. Communication is context specific. The language that the chip understands is deeply tied to the chip itself. Communication with it is context specific; it requires an understanding of the current state of the chip and the computer system to be efficient and effective. In addition it is dependent on the wiring of the chip and the features and functionality provided by it. Communication with the compiler is also context specific; it provides a simple layer of abstraction above the chip itself, but it is not able to fully distance itself from it. The program itself is also written in one context and executed in another. All of these contexts are different, and communication between parties and agents in separate contexts is inhibited by the accidental complexity that occurs when translating the message between two or more contexts. (1 minute. 7th minute)
  4. Meta-Programs are programs and agents that process these messages from the programmer. These agents communicate with each other, in general via a whole bunch of incompatible file formats and data structures, all very messy.
  5. Meta-Data is the data about software; it is the sum of all the data that is processed by and passes through all of the meta-programs. The Source Code of a program can be considered in this framework to be meta-data, but on the lowest level because it is not structured explicitly. Only after it has been processed and split up by the meta-programs does it contain more information and become more useful. This added information is the meta-data that we are really interested in. (1 minute. 8th minute)
  6. Object-Programs are the real instances of the software that is being executed by the user. The binary code of the object-program is itself meta-data that is emitted by the compiler as a message to the chip. The full trace of all the meta-data associated with this object-program is defined to be all the data that is used to produce the binary code. A full trace of all the meta-data, of all the messages that were used to produce the object-program during the entire build process, is what we are interested in collecting and understanding! (1/2 minute. 9 1/2th minute)
  7. Object-Data is the data that is contained and processed by the Object-Program. This object-data can be partially understood by looking at it and cross referencing it with the meta-data we collected about the object-program during the build. But we also need to follow the trace of the object-data through the program itself! That means we need to know all the data that flows through the final object-program running on the user's computer and capture that! If we have all the meta-data about the object-program's build, and we know the entire flow of the object-data through that object-program, and have a trace of the execution of the object-program, then we can begin to understand the structure and the source of the object-data! To collect these traces, we have to modify the object-program and teach it to intercept and enrich the object-data with meta-data, and collect the execution traces, or we need to create a better debugger or even a kernel module that can do so. If this proposed introspector kernel module were able to access the full meta-data about the build of the object program, then it could automatically collect and start decoding these traces! But in addition we will need to capture data about the execution context of the object-program in order to begin to understand data originating outside of the system. By using documented test cases and benchmarks we can give solid descriptions and meanings to the execution context. (2 minutes. 11 1/2th minute)
  8. Reflection is the process of collecting meta-data and processing it by the programmer or user. It is the basis for writing meta-programs. The programmer needs to be able to query and even update the meta-data about the object-program in order to use reflection to its fullest capacity. Programmer and user specific code that is executed at compile time would allow the most powerful form of reflection, the ability to add new processing instructions and patterns into the compiler itself. This would require communication from the user context or programmer context back into the context of the compiler. The introspector aims at providing this ability by opening up a communication channel from the users and programmers to the compiler developer! Now, more simply, reflective code that is executed at run-time needs to be able to access and maybe even update the meta-data of the program, which can be stored in a file or embedded into a shared object. (1 minute. 13 1/2 th minute)
  9. Introspection is the process of a user or programmer evaluating the results of reflection. It is normally motivated by the need to learn about the object-program, by a concrete problem in the object-program, or by the need for a feature. The full set of meta-data, including traces of the object-data, is evaluated in the context of that concrete problem. Ideally the introspection would be started with an input file that describes the exact nature of the results to be gained.
  10. Resolution is the creation of concrete changes to the object program. It will normally result in a set of meta-data describing the new things to implement.
  11. Execution is the final committing and implementation of these changes to the object program. This includes generation of code, creation of packages, and the communication of meta-data back into the context of the build environment.
  12. Interception is the process of intercepting, and capturing the message between two meta-programs or between two functions in a meta-program.
  13. Enrichment is the process of adding in more context data and more meta-data to the existing set of data. This feeds understanding.
  14. Visualization is the process of selecting, focusing, filtering, layouting the meta-data and feeding the results to the visual cortex of the user for further pattern matching.
  15. Understanding is the human mental process that involves the visualization of the results of the introspection, and the refocusing of that process on arising open questions until the mind has built an internal mental model of the software. Understanding involves translating between contexts, and the creation of abstract contexts.
  16. The mind of the programmer and user of the final program is what is feeding information back into the meta-program, and therefore the interface to the user and programmer must be as good as possible. The current practice of using many different languages and formats for the meta-data creates accidental costs in the communication. The actions of the programmer can be seen as following some process and program; these actions are then codified in the meta-program.
Thank you for listening to my speech,

I hope that I have explained the motivation and the goals of the introspector project to you. If you are interested in hearing more about it, please contact me at mdupont777@yahoo.com, or jabber me at mdupont@nureality.ca, or visit the introspector irc chat at irc.freenode.net:#introspector

Tuesday, February 22, 2005

Preparation for presentation

I will be giving a short lightning presentation on the introspector on Sunday the 27th of February at the FOSDEM.

I want to give an overview of the presentation, also for me to collect my thoughts and give focus on the important aspects of the project. Here is my current outline :
  1. Motivation and History
    1. Usage of DBASE and code generators
    2. Creating of repetitive code for C structures
    3. Working on OEW, Object Engineering Workbench
    4. Looking into the GCC
    5. Creation of a C++ interface
    6. Trying to create Dumper
    7. Experiments with Prolog
    8. Experiments with XML / Perl
    9. Creation of ICE Cube Prototype
    10. Usage of RDF, Redland
    11. Experiments with modelling DotGNU, Python, M4, Bash
    12. Experiments with CWM
    13. Experiments with EulerSharp
    14. Creation of an Ontology
    15. Creation of new perl interface for dumping
    16. Creation of high speed ICECUBE representation
    17. Clearer definition of project scope
  2. Architecture, Overview and Processing Model
    1. Extraction of Reflection Data from Source Language
      1. Creation of Graph Traversal of internal data structure
      2. External Representation of Graph as RDF
      3. Compression of RDF into a vector of Statements
      4. Reduction of the cardinality of the Subjects
      5. Creation of Subject Types
      6. Finding of Rools
      7. Relating Software on to RDF
    2. Compression of data
    3. Relating of Source Code to Semantic Web
      1. Source Code association to Comments like with Doxygen
      2. Source Code association to Bugs, Feature requests
      3. Source Code association to CVS changesets
      4. Source Code association to Authors
      5. Relationships between Versions of source files
      6. Relationship between Source Code and Specification Documents
      7. Relationship between identifiers and WordNet
  3. Other related Projects
    1. OpenC++
    2. Stratego
    3. LLVM
    4. See others on the WIKI and WIKIPEDIA
  4. Basis Technology
    1. GCC -fdump-translation-unit
    2. RDF
      1. Redland
      2. CWM
    3. Perl/Bash/TextUtils
    4. VCG/GraphViz
  5. Future Directions
    1. Lapack/Scalapack/Octave
    2. PostGres/MySql repository
    3. DotGNU interface
  6. Current Progress
    1. Crystalized GCC Concepts
      1. File
      2. Declaration
      3. Type
      4. Size
      5. Chain
    2. New Concepts
      1. Global Statement Vector
      1. Global Subject Vector
      1. Subject Types
      2. Super Types
      3. Roots and Contexts
      4. Relationships between Context Graphs
  7. How Can You Help
    1. Testing
    2. Documenting
    3. Packaging
    4. Command Line Processing
    5. Better Dump Format

Monday, February 21, 2005

Why is the TV Flashing at me? relaxen und watchen das blinkenlichten

I live across the street from a bunch of TV addicted people living in a Social Housing Project.
My new neighbors on the top floor are a bunch of kids who have a HUGE television.

It is just on the periphery of my vision, and tonight I realized that something was bothering me : THE TELEVISION WAS FLASHING RAPIDLY.

Now, I have been watching the window while collecting data for this article, and can say that it seems that the flashing effect can also take place when the scene changes from dark to light quickly.

That reminds me of the poster from the copy room : Blinkenlights Posters: "relaxen und watchen das blinkenlichten."

The funny thing is that I cannot see exactly what is on the screen, since it is at a 10 degree angle to me, but I do notice that it has a steady picture most of the time. Only once in a while, for short periods, is it flashing.

My gut feeling is that the flashing lights will cause a change in the mind of the observer and make them more imprintable for a period of time. A form of Hypnosis.

Googling for the idea produced some interesting hits :

Here is a good article on the negative effects of flashing lights :
Spirit of Change Magazine - Good Health is in the Eyes of the Beholder: "The cerebrum is the part of the brain that processes shape, color and movement of visual stimuli. Flashing light, however, is processed in the brain stem which functions primarily through reflex. This vision-induced shift in brain function from cerebrum to stem caused by flashing lights may have serious health and safety consequences far beyond the impairment of our reading capabilities. "

Here are some articles on the Pokémon effect :
The Psycho-Log - The Psychological Community: "'went into a trance-like state, similar to hypnosis, complaining of shortness of breath, nausea, and bad vision . . .'"

The Psycho-Log - The Psychological Community: "In 1994, British commercial television ads and programs were limited to a rate of three flashes per second. The limit followed a 1993 incident in which an ad for noodles featuring fast-moving graphics and bright flashes sparked three seizures."

(Brain of Stig) Cartoon induced sickness, ~729 kids rushed to hospital: "The blame was put on a scene depicting an explosion followed by five seconds of flashing red lights from the eyes of the most popular character, 'Pikachu,' a rat-like creature."

Here they are talking about generators for mind states, depending on the types of flashing, different mind waves can be created.
BRAIN ENTRAINMENT is the reverse of biofeedback.
"Those low frequency electrical brain rhythms which are characteristics of various moods and states of sleep can not only be read out using biofeedback equipment or EEG machines, but using radio, sound, contact electrodes, or flashing lights, the moods and sleep states can be generated or at least encouraged using brain entrainment devices. Brain entrainment cannot carry voice, which is a much higher frequency range. Brain entrainment can, however, be used to "set up" a target to make him/her more susceptible to hypnosis. These major technology classes can produce some of the observed mind control effects, from hiding, undetectable, with the exception of remote physical manipulation."

Now, here they are talking about different types of brainwashing in modern adverts.
Subliminal Advertising and Modern Day Brainwashing: "Tachistoscope projectors, which can flash words or images onto the screen with a duration of several milliseconds, are used to display the stimuli."

That article contains an interesting list of motivating factors of the mind :
Subliminal Advertising and Modern Day Brainwashing: "human psyche: emotional security, reassurance of worth, ego-gratification, creative outlets, love objects, sense of power, sense of roots, and immortality"

I don't have any conclusions on this topic, but am interested in your feedback.

mike

Wednesday, January 26, 2005

Reflections about the usage of the term reflection in the news today

Along the lines of a recent article about introspection,
I have started a google alert agent for the terms reflection and to reflect.

Reflection is a much easier term to understand than introspect.

So, first of all, here reflection is used in an introverted way,
the toned down partying of Australia day. It is also a reflection on the recent events of the disaster, and indicates that reflection requires resources, time and mental capacity. It is almost like mourning!

Military to reflect on Australia day celebrations
ABC Online - Australia

Australian military personnel serving in tsunami-devastated areas in southern Asia will celebrate a low-key Australia Day, out of respect for the local communities.

Commander Steve Dunning, who is currently in Banda Aceh, says morale has been boosted by the strong public support in Australia.

"[We'll] still be working hard, but we'll take some time out to reflect," he said.

This next article continues the mourning about the past, but in a different way. It combines reflection with action, so it is more a form of decision making, or part of the introspective change process. The decision and action process is guided by the reflection on the past.

AUSTRALIA DAY' TIME TO REFLECT
Special Broadcasting Service - Australia
While non-indigenous Australians around the country celebrate Australia Day, for indigenous communities the day is better known as "Survival Day" or "Invasion Day", and is a time of reflection and action.

"Aboriginal people, I think, it's a day of reflection and you think about the atrocities of the past, the slavery, the stolen wages and the endless other human rights abuses," Mapoon Aboriginal Council chairman Peter Guivarra said.

"They'll never be forgotten like some of the things I call institutionalised terrorism, but we must learn to forgive," the Cape York leader added.


Here we have an artwork called "reflection". It is a gate, and the article is very complex and artsy. But we can extract the basics by looking for the key concept of looking back at the past. He is also reflecting on life, and trying to represent this with the symbol of a gate or transition.

Time to reflect on transition
The Japan Times - Japan

"When I was little, I had to go in and out of this gate every day," the artist explained. "...It is kind of a contemplation on what is reality and what is illusion."

It is these qualities that Suh wants to address. As a Buddhist, he believes in reincarnation, and likens life itself to a transitional space, a passageway from one place to another -- something the ethereal qualities of "Reflection" communicate rather well.
Here is an artwork named reflection.

This brings up the whole idea of how reflection is represented: How can we represent the result of our reflection? What is its structure? How much of it is time dependent and serial, and how much is eternal? I think that this is something that will need a much more detailed answer.

Here we come to more practical usages of reflection, the bare representation of ideas.
A statement reflects views. The action of making a statement is a decision based on observations and values. Not every statement or representation is as good as another.

Stafford Energy, Inc. Subsidiary Abucco Poised to Penetrate ...
PrimeZone (press release), CA - Jan 25, 2005

... Such statements reflect management's current views, are based on certain assumptions and involve risks and uncertainties. ...
"Stafford undertakes no obligation to publicly update these forward-looking statements to reflect events or circumstances that occur after the date hereof or to reflect any change in Stafford's expectations with regard to these forward-looking statements or the occurrence of unanticipated events."
Here we have a concrete decision that a proposed action does not reflect/represent the priorities of a country. They have decided to not put major funding into a major operation, but put small funding into a small group that will help make better decisions. This is a representation/optimization decision.

UK says science aid should reflect country priorities
SciDev.net, UK - Jan 24, 2005

The British government has rejected calls for a coordinated donor strategy for scientific capacity building in Africa and for a new grant-giving body ...
"... Instead the government will set up a small working group to assess the potential role and structure of a board that would advise the government on how to fund research pertaining to international development."

Here are reflections in a critical, self improvement, learning process.

Renault reflect on first day with new car
F1Racing.net - Netherlands

After his first laps in the car, Fernando Alonso commented: "I have a good first impression of the car, in spite of the delays on the first day. ...


This next one uses another interesting word, pensiveness:



  1. [n] deep serious thoughtfulness
  2. [n] persistent morbid meditation on a problem

This is an interesting term; it is the negative introspection I was talking about before.

Michael Mann, a producer of the movie The Aviator, talks about the reflections that occurred after he got 13 nominations for movies he worked on. It shows you again how introspection/reflection can be a negative experience.

Oscar noms cause Mann to pause, reflect
Hollywood Reporter - Los Angeles,CA,USA
His joy then turned into pensiveness when he started thinking of "Aviator's" long development history and the challenges it overcame. "(Movies) are all long roads. They're all filled with struggle and that's great. That's part of the adventure of it. But when a moment like this comes around, you reflect back on the history of it. And I don't normally do that. But this morning certainly made me do that."

On that line, we conclude with someone who is not ready for reflection, a man of action.
But this reflection is also a form of self criticism. It looks like he is going through some form of blocking reflection, because he does not want to call it over. I doubt this is the right attitude, but it is indicative of the difference between action and reflection, they seem to be two entirely different modes of thought.

Roethlisberger isn't yet ready to reflect on Steelers' season
" Roethlisberger's gloves were widely blamed for his poor showing "
"It's always been a dream of mine to play in the NFL, even as a little boy, through high school and college," Roethlisberger said. "I do sometimes think about, 'How did I get here?'
"But I don't want to sit back and reflect on anything that's happened. I don't want to reflect on this year, because I think when you reflect on something, it's over."

I hope that you find this as interesting as I did.

Mike

Tuesday, January 25, 2005

Musicbrainz Prototype for a higher performance batch mode processing

Hi there,

I have been prototyping some code for a new way to use musicbrainz, in a batch mode. I found that the tagger mechanism is not efficient in terms of computing resources and bandwidth.

Currently, the mb tagger requests information from an apache server running a perl script that looks in a database for each file. This is pretty high overhead, considering that for the first pass, all that is needed is to know :
  • Is the file corrupt?
  • Is the file duplicate?
  • Is the file in the database?

When you know all this, you could submit the new information to musicbrainz, or submit one large file that contains all the needed information at once.

Otherwise, the database could be queried locally and only the new information could be submitted.

In any case, when you have 10K mp3 files, the processing with musicbrainz is painfully slow and the tp_tagger software crashes regularly.

This is what I did so far.

1. I downloaded the mbdump.tar.bz, unzipped it, took the track file, and used cut and sort to extract a sorted list of trms. Then I converted this into a 50mb binary file that is packed very tight; bz gives only an 8% improvement on it. This 50mb file will be easy to download and use.

You can find the packer here :
http://introspector.sourceforge.net/2005/01/trm_index.c
It uses the two files from musicbrainz :
http://introspector.sourceforge.net/2005/01/uuid.h
http://introspector.sourceforge.net/2005/01/uuid.c

Now, I want to use this index to do a merge join in linear time against my database of mp3s. So, I want to extract the TRMS of them into a file that is also sorted and can be quickly read. For this purpose, I have modified mp3info to extract all the attributes to create the trm.
http://introspector.sourceforge.net/2005/01/mp3info.c

It links against libmusicbrainz and takes a list of files as a parameter, but runs pretty slowly, at about 1000 files an hour.

Today, I took the mp3.cpp code and reworked it to be in c, all in one file, and also to read the entire file into memory. It runs at about 2x the speed of mp3info.
http://introspector.sourceforge.net/2005/01/mp3.c
and
http://introspector.sourceforge.net/2005/01/mp3_c.h

When it is finished, I will create a simple trm generator based on the output and write the join program.
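
The join program is not written yet, so here is only a rough sketch of what a linear-time merge join over two sorted trm lists could look like. It is in Python rather than the C of the tools above, the function name merge_join is made up for this example, and the trm strings are placeholders, not real musicbrainz data.

```python
from typing import List, Sequence, Tuple


def merge_join(index_trms: Sequence[str],
               local_trms: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Split local trms into (known, unknown) by walking both sorted lists once.

    Both inputs must already be sorted in the same order. Each list is
    traversed a single time, so the cost is linear in their combined length.
    """
    known: List[str] = []
    unknown: List[str] = []
    i = 0
    for trm in local_trms:
        # Advance the index pointer while it is still behind the local trm.
        while i < len(index_trms) and index_trms[i] < trm:
            i += 1
        if i < len(index_trms) and index_trms[i] == trm:
            known.append(trm)      # already in the downloaded index
        else:
            unknown.append(trm)    # candidate for submission
    return known, unknown


if __name__ == "__main__":
    # Placeholder identifiers standing in for sorted trm values.
    index = ["trm-0001", "trm-0003", "trm-0005"]
    local = ["trm-0002", "trm-0003", "trm-0005", "trm-0006"]
    known, unknown = merge_join(index, local)
    print("known:", known)
    print("unknown:", unknown)
```

Only the trms that land in the unknown list would need to go over the network, which is the whole point of doing the join locally.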

Mike

Monday, January 24, 2005

Introspection, Warts and All!

Introspector : the process of Introspection, to Introspect

I am a regular reader of news.google.com, and decided, as a joke, to google the news for the term "introspector". Although it was not found, looking for "introspect" and "introspection" gave some interesting insights into how this term is being used in everyday journalism.

I found an interesting usage of the term to introspect that I kinda liked,
not the fluffy idea of just "looking at your mind", although this can apply to that as well.

It turns out that introspection is seen by some as a brutal honesty, a valuation of your own self with a bent on the negative. It means admitting mistakes, looking beyond the facade. It boils down to a skeptical approach. This allows you to be critical of yourself and promotes change when it does not cause paralysis and depression.

The point is that just looking at something is not enough, the reflection is just feeding you with data. You also need to process this information and feed the decision making process and eventually take concrete action upon it.

The Sunday Times of Malta writes
"We do lack, though, an ability to meditate about ourselves, to introspect. If there are mirrors on the walls of our conscience and consciousness, we do not look into them much"
That is an interesting observation: we lack this ability in general, and if we have it, we don't use it much. This implies that it is a tool or skill that has to be learned and is not something you are born with. I think that it is in some form a type of learned self criticism that some people carry on to the point of paralysis. It can be painful, but it is not impossible. Any time you listen to criticism from outside and then process this criticism, you are going through a process of introspection. In fact, introspection is a form of taking the pain out of criticism, because you then internalize the external critic, and identify yourself with it, thus being able to see yourself from outside.

Now, here is another one I liked because they connected introspection with the negative aspects, the warts.
"Jonathan Moyo, put it this way: 'ZANU PF needs to introspect a little and see itself warts and all. ...'"
Now, here is the concept of brutal truth: seeing your own failures and accepting them.

Then I decided to search for this expression, "see itself warts and all".

Now, here is a definition of the role of journalism that uses that term, warts and all, and has a reference to a mirror.
"The role of journalism is to hold up a mirror to society so it can see itself, warts and all. The role of journalism is to shine a light on all sides of an issue as fairly and objectively as possible so people can make up their own minds."
Here is another one, about a guy who creates learning organizations
"I care about learning in organizations. I want to help organizations create processes, structures, and cultures that support learning and change."
This is also an important factor in the introspector project, the aspect of learning.
He goes on "My work puts a mirror in front of the organization so that it can see itself, warts and all."
On the same page, another reference to the mirror aspect, and warts.
"....we took people through a process of reflection and feedback that empowered them to learn and change"
I like that as well, the process of reflection and feedback makes an inanimate object come to life.

Now, we get to the light. What is the light that shines upon the introspector program itself? It is the light of the mind of the user of the software. So, the introspector is a tool to allow you to shine your mental light and see software, warts and all.

Here is an example of a somewhat mystical usage of the term introspection talking about a diagram of the four quadrants of spirituality :

"things like feelings, ideas, wishes, interior states, even things like mathematics and logic, none of which can be seen running around out there in the sensory world, but can only be accessed by looking within by introspection, awareness, contemplation, meditation, phenomenology, and so on. In figure 1, you can see a few representative items that you can be aware of if you introspect your own mind or awareness or experience things like sensations, feelings, images, symbols, concepts, and so on, none of which can be see in the exterior world."
That is kind of interesting, talking about the expression of cultural inner values being expressed as magic. It has long been my belief that all these belief systems are externalizations of the inner beliefs of people.

Now, we turn to hits on the act of introspection.

It turns out that some people think that George Bush is capable of learning reflection.
"Last week was a week of introspection in Washington."
"Most notably, George W. Bush danced right up to -- but did not cross -- the line of admitting he might have actually made a mistake during his first term."
"In a first-term press conference, [the president] said he could not remember any mistakes. Thursday, Bush was more reflective."

It can be that introspection goes too far, as pointed out on indiaexpress.com:

In fact, self-introspection almost becomes self-depreciation when the party says: "One can take very good decisions, pass very correct political resolutions and give fine slogans. But unless party organizations exist and have live contacts with the masses, they will remain only on paper."



Mike North writes in the chattanoogan, an article titled "In My Humble Opinion: Reflection, Introspection, Resolution And Execution"
"I use the last week of the year for reflection, introspection, and resolution. I encourage you to do the same. But how, practically speaking, does one go about such a process?

You can begin by starting the new year with a journal. Reflect upon the past year. Write down the joys, pain, successes and failures. Make a list of the people you love, and of those that you may not care for quite as much. Honestly assess where you are in your relationships and career or life goals. Keep the journal all year long. It will be invaluable come time to repeat the process next year.

The next step is introspection. Look at that list of failures. Were they your fault, or beyond your control? What could you have done differently? Do you deserve the credit for your successes? What about those people with whom you don't get along? Could you do more to get along with them? Are you where you want to be in life? If so, how do you prepare for the next step? If not, why not?

Be frank with yourself. Make note of the things that you know need improvement. The first part of any battle plan is a thorough analysis of the enemy, his strengths, and his weaknesses. This is no less important when the enemy is yourself.

Being totally honest with ourselves may be the hardest part of this process. We tend to rationalize. This tendency is the biggest obstacle to personal growth and improvement. The lack of self-discipline is a problem too, but even that is easier to correct if we'll admit that we're lazy sluggards totally lacking willpower."

Here is another example of reflection feeding decision :
“The book has come at the right time, when I am at an age (48 years) wherein I can reflect back about my own life and my work and also look forward to plan my future. I have a lot more time to think, take stock of my responsibilities and position myself in the art scenario.”

So, reflection collects data, introspection assigns semantics to it, resolutions are decisions made upon it, and the execution carries them out.
This reminds me of the scientific method: observation (reflection), hypothesis (introspection and resolution), experimentation (execution).

Summary

Now we get to the point of looking into the software. Software is some form of external expression of the inner thoughts of people. The introspector allows you to reflect upon them. But in the end, the process of introspection, the assigning of values to this reflected data, is work intensive. It requires a value system from the user. So the introspector learns to help the user evaluate the data collected from reflection. Then the decision making process kicks in, and the execution.

Looking at the entire program, and not just at the surface, allows you to really understand it: looking at the data values used, at the documentation and specification of the various parts, and at the data structures and how they are used. All of this is planned as functionality of the introspector. Only when you have a single critical viewpoint from which you can observe the entire system will you also have ground to stand on and be able to change it.

I hope that you find these examples as interesting and as instructive as I did.

Mike

Friday, January 14, 2005

Proposal for an apache index.rdf, rdf based httpd metadata

Dear All,

I have been looking into making my http archive more usable.

Being frustrated by the sourceforge file management system, I have moved to just placing files in the http server of my sf.site. This is really easy to do via ssh, without any password via shared key files.

Recently, I discovered something long known to most of you: you can place directives that describe your files, and a readme, in your .htaccess.

The directives that I am thinking about describing in rdf are from mod_autoindex and mod_mime.
Here is a simple article on this topic.

So, I have been thinking about how to make this operation simpler and prettier :
I propose a simple way to create the data for the .htaccess file out of an rdf file, and later to create an apache module that implements rdf directly in apache.

The idea would be to create an index.rdf file for your website that would contain DC and RSS information about the files in the directory. This information would be converted to an .htaccess file described by your rdf.

This whole thing can be implemented as a simple redland perl module that runs on the webserver and is triggered by a make.
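
To make the conversion step a bit more concrete, here is a small sketch in Python (not the redland perl module I have in mind). It skips the rdf parsing entirely and assumes the descriptions have already been pulled out of index.rdf into a plain dictionary; the function name and the sample descriptions are made up for the example. The directives it writes, IndexOptions, ReadmeName and AddDescription, are the mod_autoindex ones.

```python
from typing import Dict, Optional


def htaccess_from_metadata(files: Dict[str, str],
                           readme: Optional[str] = None) -> str:
    """Render mod_autoindex directives from a {filename: description} mapping.

    The mapping stands in for the DC descriptions that would come out of
    index.rdf; reading the rdf itself is outside the scope of this sketch.
    """
    lines = ["IndexOptions FancyIndexing"]
    if readme:
        lines.append(f"ReadmeName {readme}")
    for name in sorted(files):
        # AddDescription takes a quoted string, so avoid nested double quotes.
        description = files[name].replace('"', "'")
        lines.append(f'AddDescription "{description}" {name}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    metadata = {
        "trm_index.c": "packer for the sorted trm index",
        "mp3.c": "single-file c rework of the mp3 reader",
    }
    print(htaccess_from_metadata(metadata, readme="README.html"), end="")
```

A make rule could then regenerate the .htaccess whenever index.rdf changes. File names containing spaces would need extra quoting, which this sketch does not handle.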

The next step would be to create a site rss feed for the files themselves, updated when they are added. Furthermore, it should be possible to create an html rendering of the rdf file as a special file in the directory.

All of that could be handled by an rdf file that is uploaded alongside the original file, describes that file on the server, and is then processed by a perl script executed via ssh.

In the end, an apache module can be created to manage this entire process.

The advantage would be that apache can collect metadata directly about each file and store it in a simple and standard way. The metadata can of course be augmented with much more information about the meaning of the files themselves, once an rdf representation is available.

One type of information that would be useful would be a way to collect all the google references to each file that can be used to determine the effect of moving a file.

more on this later.

Thursday, January 06, 2005

n3v [n triple vector]: an efficient memory representation of compiler graphs

Happy new year!

I have finally gotten around to start unifying the ideas of ice cubes and rdf.

The ice cube idea was to use a binary matrix representation of the graph as a N*N cube of data.

RDF/ntriples is based on making statements of triples that describe the graph.

Today I built a new binary representation of the ntriples format. It is based on the idea of representing each uri as an index into a vector. This index should be as compact as possible, so we can exploit the cache of the computer.

This gets into the area of linear algebra; the lapack tools will be interesting, and ScaLAPACK provides a distributed processing mechanism for them. I will have to write more about that in the future.

Basically it boils down to creating a vector of uris, and assigning those uris an index. Optimally the index would come from a perfect hashing function.

For the introspector's gcc graphs, this index is already there: it is the node id that was assigned during the traversal of the compiler graph, so I just extract that number, which is encoded in the uri of the node.

For the predicates, an id is assigned as a counter, first come, first served.

This is done by the n3v_converter.pl program.

It has the following parameters :
  • input_uri the uri (file:foo.ntriples) to parse
  • map_file the map file of predicates to indexes
  • output_file the output file to produce
  • debug_file the debug file to emit
  • PACKFORMAT the format of the binary file

PACKFORMAT is three chars, one each for the subject, predicate and object.

It is passed directly to perl's pack routine; one page that documents it is here.
Here are some useful values, but it occurs to me that a fixed width char format might be interesting as well!

  • C An unsigned char value.
  • S An unsigned short value. (This 'short' is _exactly_ 16 bits, which may differ from what a local C compiler calls 'short'.)
  • I An unsigned integer value. (This 'integer' is _at_least_ 32 bits wide. Its exact size depends on what a local C compiler calls 'int', and may even be larger than the 'long' described in the next item.)
  • L An unsigned long value. (This 'long' is _exactly_ 32 bits, which may differ from what a local C compiler calls 'long'.)
The resulting packed input file can be read directly into memory.
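
As an illustration of the record layout, here is a small Python sketch of the Short/Char/Short case. It is not n3v_converter.pl, and the function names are made up; it assumes the subject and object node ids are already plain integers and only assigns the predicate ids, first come, first served. The struct format "=HBH" is meant as the rough equivalent of a PACKFORMAT of SCS: native byte order, 16 bit subject, 8 bit predicate, 16 bit object, no padding.

```python
import struct
from typing import Dict, Iterable, List, Tuple

# One fixed-width record per triple: unsigned short, unsigned char, unsigned short.
RECORD = struct.Struct("=HBH")


def pack_triples(triples: Iterable[Tuple[int, str, int]],
                 predicate_ids: Dict[str, int]) -> bytes:
    """Pack (subject_id, predicate_uri, object_id) triples into fixed-width records.

    predicate_ids is filled in as new predicates are seen, which plays the
    role of the map file of predicates to indexes.
    """
    out = bytearray()
    for subject, predicate, obj in triples:
        pid = predicate_ids.setdefault(predicate, len(predicate_ids))
        out += RECORD.pack(subject, pid, obj)
    return bytes(out)


def unpack_triples(data: bytes) -> List[Tuple[int, int, int]]:
    """Read the records back as (subject_id, predicate_id, object_id) tuples."""
    return list(RECORD.iter_unpack(data))


if __name__ == "__main__":
    predicate_map: Dict[str, int] = {}
    packed = pack_triples(
        [(1, "type", 2), (2, "name", 3), (3, "type", 1)],  # hypothetical node ids
        predicate_map,
    )
    print(len(packed), "bytes for 3 triples")
    print(unpack_triples(packed))
    print(predicate_map)
```

With 5 bytes per triple, a whole graph fits in one contiguous buffer, which is what makes reading it straight into memory attractive. The short and char widths only work while there are fewer than 65536 nodes and 256 predicates.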

Here is an example program that reads the Short/Char/Short triple structure.
Here is the input file that it is hardwired (in terms of array size) to read.

here is my post to rdfig/swig on the freenode irc chat

more to follow.

mike

Friday, December 24, 2004

Idea of the Introspector:SWSH the SemanticWebSHell

First of all, Merry Christmas!

In this article, I want to propose a semantic web shell, the SWSH, that will be the first key user interface component to the introspector.

The Name

I was looking at the CWM today, and then Swish, and was wondering about all these Semantic Web acronyms: Swish, Swig, Swap. The first name that came to my mind was Swash (but later I found out that it is taken), and I thought, what could that swash be?

What will SWSH be?

Well like bash, but for the semantic web. This is something that has been going through my mind as of late. It is related to applying the introspector to bash as well. I have long since planned to make an introspector interface to bash, and I have the need to pass more information into the gcc to guide the rdf output, so my plan was to describe all the parameters to the gcc in rdf so that I can relate them to the resulting output.

A semantic web shell will allow you to interface to any command from the shell, but the parameters, returns, environment variables and scripting are accessible via rdf resources.

Each shell script, each command, each variable, each invocation and each file is defined as an rdf resource, and they can all be annotated.

The environment of your shell will be an rdf storage. The shell will allow you to wrap bash commands as resources as well, and describe them using rdf.

Networks of Pipes
The piping system is important, pipes will be able to be defined as rdf graphs, and the most interesting thing will be the logic that can be done on the data elements between the pipes.

You might decide to insert an agent to make some decision on the data in the pipe as it passes through, and split it out into multiple pipes depending on the value. This will all be possible.

Adding in semantics to existing output
The next innovation will be conversion of data from cut and split into rdf.
You will be able to say that "cut -d: -f 1,4,6" will return three columns and define a resource to describe them further. This will allow you to mark up text files.
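
Just to show the flavor of this markup, here is a toy sketch in Python. It assumes a colon separated line in the style of /etc/passwd, where fields 1, 4 and 6 would be the login, the group id and the home directory; the column names, the example.org uris and the function name are all invented for the example and are not part of any existing tool.

```python
from typing import List, Sequence


def cut_to_triples(line: str,
                   fields: Sequence[int],
                   names: Sequence[str],
                   base: str = "http://example.org/") -> List[str]:
    """Turn selected fields of one delimited line into ntriples-style statements.

    fields are 1-based positions, as with cut -f; names label each selected
    column. The first selected field is used to build the subject uri.
    """
    columns = line.rstrip("\n").split(":")
    subject = f"<{base}row/{columns[fields[0] - 1]}>"
    triples = []
    for position, name in zip(fields, names):
        # Escape backslashes and quotes so the literal stays well formed.
        value = columns[position - 1].replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} <{base}col/{name}> "{value}" .')
    return triples


if __name__ == "__main__":
    sample = "mike:x:1000:1000:Mike:/home/mike:/bin/bash"
    for statement in cut_to_triples(sample, (1, 4, 6), ("login", "gid", "home")):
        print(statement)
```

The real thing would keep the column descriptions as rdf resources next to the command, so that any tool further down the pipe can look up what column two means.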

Implementation :

Of course you are asking yourself how this may be implemented.

One key component here will be replacing the getopt lib with redland. All of the options to all these tools will be able to be passed via rdf.

Another key component will be replacing printf with an emission of an rdf statement. This will include all the information about the context of the program where it was called, and all the parameters extracted via the gcc introspector.

In order to be able to implement all of this in one lifetime, we will need the gcc::introspector to provide us with all the information needed about the data structures of all the programs, and we will need to translate the data structures into rdf. This can be done semi-automatically, however, as is seen with the serialization routines that are possible in java and C# when reflection is enabled.

As soon as the ability to traverse the ASTs of the gcc is stable and efficient, it will be feasible to create meta programs to create a base level rdf interface to almost any program. Then by marking up the data structures with more advanced semantics via rdf, the binding can be customized and regenerated in iterative fashion.


Future Music

When the introspector is in full gear with modules for each command that is executed, you will also be able to extract the meta data out of the scripts. For example, if you have an awk script, then the awk::introspector will give you an rdf dump of that script which can be processed further.

When you are compiling, you will be able to invoke configure, make and compile all driven from the metadata about your computer. The project data will be extracted out of make, the configuration data out of autoconf, the source code data out of the gcc. By introspecting over the Linux kernel via the gcc and the gcc itself, you will have all the metadata about your machine available.

The Linux kernel will also be enabled with an introspector, so that all the kernel symbols will be accessible via the rdf query interface. Shared libs and dependencies will also be rdf resources, and include files and libs as well.

At the lowest level, even the file system will be treated as an rdf resource, directories, files will be addressable and annotatable via rdf. The commands, ls and file will be able to return rdf objects as well.

By using rdf, we can unify all the tools of the Linux system from the kernel down to the shell and present a single point of contact for all information in the system.

Also, for source code and files, we will be able to trace the history of each file: the edits to it, the copying and linking of it, etc. Via the swsh, the history file will give you all the information you could ever want. When cvs and svn have been added to this framework, you will be able to trace all changes and associate them with who made them.

When the editors and web browsers have been enabled, you will be able to get rdf descriptions of each edit and each history file. Then you will be able to find out much more easily where files and changes came from. The editors can also use this information for highlighting and intelligent editing.

mike


Tuesday, December 21, 2004

Abusing CWM and n3 to grow bushy trees

Dear fellow hackers, I would like your advice, on this new proposed syntax for cwm and rdf.

There is one thing that has been bothering me about CWM, RDF/XML, N3, Turtle, and ntriples : the lack of ability to *easily* define and process *nice looking* structures that are larger than three. When I say *nice looking* or *pretty*, of course this is a relative statement of personal preference.

My Goal : Find a simple representation for making complex trees in RDF more pleasing to the eye.

Disclaimer : Maybe this is possible, maybe not with the current set of tools. I have not done all the needed research either; this page collects my limited knowledge about the subject at the moment, and hopefully others can give me pointers in the right direction. Maybe you will find my viewpoint amusing or interesting. RDFPath seems to be going in the right direction, I need to read more about this. Here is an article that takes a shot at rdf and gets blasted; I don't want to have that happen here.

Of course you can define bags, lists, alts using rdf. You can also define chains of objects with triples between them. This however does not have a pleasing syntax!

Let's get back to what these triples are. A triple defines an edge in a graph: the starting point, the path and the ending point. But when looking for the real root of the problem, the notation is secondary. On the problem with notation, and the idea of triples:

RDF Primer: "Sometimes it is not convenient to draw graphs when discussing them, so an alternative way of writing down the statements, called triples, is also used. In the triples notation, each statement in the graph is written as a simple triple of subject, predicate, and object, in that order. [...] Each triple corresponds to a single arc in the graph, complete with the arc's beginning and ending nodes (the subject and object of the statement). [...] However, the triples represent exactly the same information as the drawn graph, and this is a key point: what is fundamental to RDF is the graph model of the statements. The notation used to represent or depict the graph is secondary."

When we question the idea of the triple, and we want to get out of its limitations, we might end up going in the wrong direction! If you are trapped in a triple, you go to a quad! Then you want to get a pent (or even a suite)! I don't want to go there.

Why three? I think that this is a number that needs some questioning. I am a firm believer in questioning beliefs and assumptions. I want to take this back to the roots. But I am going to expand on the meaning of three in the human mind.

Three is the lucky number : Mr Ogbuji writes Thinking XML: Introducing N-Triples: "Three is the lucky number" That is a good start, but he does not address why the three.

The number 3 has a deep religious history.

According to the Numerology - Wikipedia, the free encyclopedia: "3 Three relates to expansiveness and learning through life experiences. It is considered to be lucky, and is often associated with money and good fortune. Three generally depicts several people joining together to achieve a common goal, whether through a social or professional affiliation. Although three possesses attributes of wisdom, understanding and knowledge, negatively it can exhibit pessimism, foolhardiness and unnecessary risk taking."

Chunky Gulas?

I say, keep it simple silly. The mind likes three, it is that simple.

Now, I have heard that the mind is able to remember three chunks of 2, but cannot find a reference to that. It is easy to count to three. The Mnemonic page on the wikipedia also gives reference to Chunking. This article discusses 7 ± 2 on webpages. EET Templates: "Because STM's capacity is limited to seven items, regardless of the complexity of those items, chunking allows the brain to automatically group certain items together." There is an interesting discussion of Chunking in the Natural Language Toolkit. Here is also a nice article on CHUNKING AND PHRASING AND THE DESIGN OF HUMAN-COMPUTER DIALOGUES.

The problem :

The concrete problem that I have is quite simple. I want to represent trees more easily in cwm.

Let's say that I have a graph of rdf:type relationships, as an example.

Given the following turtle/n3 file :

@prefix : <#> .
:b a :a.
:b2 a :a.
:c a :b.
:c3 a :b2.
:d a :c.
:d2 a :c.
:d3 a :c3.
:d4 a :c3.
:d5 a :c3.
and the following cwm --filter

@prefix : <#> .

@prefix log: <http://www.w3.org/2000/10/swap/log#> .
this log:forAll :s,:t,:u,:v.
{:t a :s. :u a :t.} log:implies { :s :t :u.}.


It produces the following, *pretty* output:


#Processed by Id: cwm.py,v 1.144 2003/09/14 20:20:20 timbl Exp
# Notation3 generation by
# notation3.py,v 1.146 2003/09/14 20:20:24 timbl Exp
@prefix : <#> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

:a :b :c;
:b2 :c3 .

:b :c :d,
:d2 .

:b2 :c3 :d3,
:d4,
:d5 .

#ENDS
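
The effect of the filter rule can also be checked by hand outside cwm. The following Python sketch is not cwm and does not parse N3. It only applies the same pattern ({:t a :s. :u a :t.} implies {:s :t :u.}) to the example data, with plain tuples standing in for statements. The names TYPE_EDGES and apply_rule are made up for this illustration.

```python
# Each pair (x, y) stands for the statement ":x a :y" from the example file.
TYPE_EDGES = [
    ("b", "a"), ("b2", "a"),
    ("c", "b"), ("c3", "b2"),
    ("d", "c"), ("d2", "c"),
    ("d3", "c3"), ("d4", "c3"), ("d5", "c3"),
]


def apply_rule(type_edges):
    """Return every (s, t, u) for which ':t a :s' and ':u a :t' are both asserted."""
    derived = set()
    for t, s in type_edges:            # matches  :t a :s
        for u, t2 in type_edges:       # matches  :u a :t
            if t2 == t:
                derived.add((s, t, u))
    return derived


if __name__ == "__main__":
    for s, t, u in sorted(apply_rule(TYPE_EDGES)):
        print(f":{s} :{t} :{u} .")
```

The nested loop is quadratic in the number of statements, which is fine for a toy example but is not how a real rule engine would match the pattern.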


Of course this is no longer the RDF that you know, because :b and :b2 are not edges in the usual sense. The edges are implied by the structure of the graph.

I would like to say that this graph has an implied tree structure, where downward movement in the tree follows the inverse of the rdf:type relationship.

This is great for simple inheritance hierarchies. But what I would really like is this

:a :b :c :d,
:d2 ;
:b2 :c3 :d3,
:d4,
:d5 .

You may ask, what does the ";" mean then? Well, it means close off this triple in the tree, going back three steps. So, ":d2;" means close off {:b :c :d2.}

Now, let's look at a more complex example:
:a :b :c;
:b2 :c3 .

:b :c :d,
:d2 .

:b2 :c3 :d3,
:d4,
:d5 .

:c :d :e3,
:e4,
:e5 .
How would that look as a tree?


:a :b :c :d :e3,
:e4,
:e5;
:d2 ;
:b2 :c3 :d3,
:d4,
:d5 .

I think that this would be very easy to implement as a parser and a serializer.

The key would be to give cwm a hint about the direction of the tree. By declaring the document to be a tree and naming the predicate that gives the direction, it should be possible to parse and generate that tree document.
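
To make the serializer half a bit more concrete, here is a minimal Python sketch. It assumes the direction predicate is the typing relation from the example, that the data really forms a tree (one parent per node, no cycles), and it prints the tree with indentation instead of the proposed "," and ";" syntax, since that grammar is not pinned down yet. The function names are invented for this sketch and are not part of cwm.

```python
from collections import defaultdict

# Each pair (child, parent) stands for ":child a :parent".
EDGES = [("b", "a"), ("b2", "a"), ("c", "b"), ("c3", "b2"),
         ("d", "c"), ("d2", "c"), ("d3", "c3"), ("d4", "c3"), ("d5", "c3")]


def build_children(type_edges):
    """Map each node to its children; moving down the tree inverts the typing edge."""
    children = defaultdict(list)
    for child, parent in type_edges:
        children[parent].append(child)
    return children


def find_roots(type_edges):
    """Roots are nodes that are used as a parent but never appear as a child."""
    all_children = {child for child, _ in type_edges}
    roots = []
    for _, parent in type_edges:
        if parent not in all_children and parent not in roots:
            roots.append(parent)
    return roots


def render(node, children, depth=0):
    """Return the subtree below node as a list of indented lines."""
    lines = ["  " * depth + ":" + node]
    for child in children.get(node, []):
        lines.extend(render(child, children, depth + 1))
    return lines


if __name__ == "__main__":
    kids = build_children(EDGES)
    for root in find_roots(EDGES):
        print("\n".join(render(root, kids)))
```

A parser for the nested syntax would do the reverse: keep a stack of the current path and emit one typing statement each time it descends a level.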

I would propose a new set of terms to describe how to parse this new syntax :
tree.owl this is the owl ontology file that defines a tree class and direction
tree.n3 this is the n3 equivalent
tree-build.n3 This is a cwm program to build a tree
tree-test.n3 this is the test data.

I look forward to your comments.

mike

Saturday, December 11, 2004

Questions to be answered by the introspector

Today's entry will start by defining the questions to be answered and give you a structure for the problem of the introspector rdf data.

Here are the questions to be answered :
  1. How can you describe code so that it can be generated?
    1. How can you transform from the high level structures that you want to think in into the structure of the code that the compiler uses?
    2. How can you transform from the low level structures that the compiler stores the code in into the structure of the code that you want to read?
  2. What is the relationship between the ontology and the structure of the code?
    1. What is the structure of the rdf data?
    2. How can a rdf data set be converted into an ontology?
    3. How can you create code out of a ontology?
    4. How can you create the ontology out of the rdf data?
    5. How does the structure of the gcc affect the structure of the rdf data?
    6. How does the structure of the input source code affect the structure of the rdf data?
      1. What if the input source code to the introspector is the dumper code?
    7. How does the structure of the dumper code affect the structure of the rdf data?
  3. What is the relationship between the ontology and a query?
    1. How can an ontology be converted into a set of queries that checks a set of rdf data for meeting that ontology?
    2. How can a query be optimized by consulting an ontology?
  4. What is the relationship between an index and an ontology?
    1. What are the types of queries needed?
    2. What are the cardinalities of the data sets?
    3. What are histograms of values in a property?
    4. What are primary keys of class?
  5. How can a rdf data set have statistics extracted from it that feeds the indexing, ontology, query optimization aspects?
    1. How can we store this statistical data?
    2. How can we query this statistical data?
    3. What is the relationship between an ontology and the statistical data extracted from a data set?
    4. Does this have to do with Bayesian classification?
      1. see also http://ebiquity.umbc.edu/v2.1/project/html/id/59/
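
As a small illustration of the statistics questions (cardinalities, histograms, primary keys), here is a Python sketch over a hand-written list of triples. The sample data and all function names are assumptions made for this example. It does not read real introspector output.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (subject, property, value).
TRIPLES = [
    ("f1", "rdf:type", "function_decl"),
    ("f2", "rdf:type", "function_decl"),
    ("t1", "rdf:type", "integer_type"),
    ("f1", "name", "main"),
    ("f2", "name", "helper"),
    ("f1", "returns", "t1"),
    ("f2", "returns", "t1"),
]


def property_cardinalities(triples):
    """Count how many statements use each property."""
    return Counter(p for _, p, _ in triples)


def value_histogram(triples, prop):
    """Count how often each value occurs for one property."""
    return Counter(o for _, p, o in triples if p == prop)


def candidate_keys(triples, class_name):
    """Properties with exactly one value per instance and no value shared between instances."""
    instances = {s for s, p, o in triples if p == "rdf:type" and o == class_name}
    by_prop = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        if s in instances and p != "rdf:type":
            by_prop[p][s].append(o)
    keys = []
    for prop, per_subject in by_prop.items():
        single_valued = all(len(per_subject.get(s, [])) == 1 for s in instances)
        values = [v for vs in per_subject.values() for v in vs]
        if single_valued and len(set(values)) == len(values):
            keys.append(prop)
    return sorted(keys)
```

Numbers like these could then be stored as rdf statements about the properties themselves, which is one possible answer to the storage question.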

Friday, December 10, 2004

Type Inversions : First step, Identification of problem

I have been reading Knuth[1] Vol 3 about sorting, where he first talks about inversions and permutations. This has gotten me thinking about sorting types.

One of the functions of the introspector has to be that of emitting source code, and for strongly typed languages, the source code has to be emitted in the right order.

One problem that has to be addressed by the introspector is that of sorting the declarations when they are emitted. Another is when a type requires a forward declaration because of a circular type definition.

The first thing that comes to mind is a topological sort. I think, however, that there is a more elegant way to do this. If each type is given a number, and each reference to another type folds that type's number into the new type's number (by addition or multiplication or whatever), then the container type will always have a larger number!

By then finding the inversions, or the unsorted parts, we know where a forward reference is needed or even where a sort has to take place. When you have a pointer to a type, a special number will be needed that basically contains the number of a void *. Any pointer can be turned into a forward reference. Maybe a negative number can be used for such pointers.
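
Here is a minimal Python sketch of the numbering idea, under several assumptions: numbers are combined by addition, every pointer counts as a fixed POINTER_NUMBER (standing in for the number of a void *), and by-value containment has no cycles (C does not allow those anyway). The TYPES table and all function names are hypothetical.

```python
# Hypothetical declarations: name -> (types embedded by value, types reached via pointer).
TYPES = {
    "list": (["node"], []),        # struct list { struct node head; };
    "node": (["int"], ["list"]),   # struct node { int value; struct list *owner; };
    "int": ([], []),
}

POINTER_NUMBER = 1  # assumption: any pointer counts like a 'void *'


def type_numbers(types):
    """Give each type 1 plus the numbers of everything it contains."""
    numbers = {}

    def number(name):
        if name not in numbers:
            by_value, by_pointer = types[name]
            numbers[name] = (1 + sum(number(t) for t in by_value)
                             + POINTER_NUMBER * len(by_pointer))
        return numbers[name]

    for name in types:
        number(name)
    return numbers


def inversions(order, numbers):
    """Pairs in the wrong order: a larger number appears before a smaller one."""
    return [(a, b) for i, a in enumerate(order) for b in order[i + 1:]
            if numbers[a] > numbers[b]]


def emission_order(types, numbers):
    """Sort declarations so that contained types come before their containers."""
    return sorted(types, key=lambda name: numbers[name])


def forward_declarations(order, types):
    """Pointer targets that are emitted later than the type pointing at them."""
    position = {name: i for i, name in enumerate(order)}
    needed = []
    for name in order:
        for target in types[name][1]:
            if position[target] > position[name] and target not in needed:
                needed.append(target)
    return needed
```

With the sample table, the declared order list, node, int has three inversions, the sorted order is int, node, list, and list needs a forward declaration because node points at it before it is emitted.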

Anyway, this idea is not finished, but I wanted to share it with you, and also write it down. Maybe you will find it interesting. This is the first step: identifying the problem and capturing the feeling and intuition. Later this idea will be filled out.

References :
[1] Knuth, Donald E. The Art of Computer Programming. Vols 1-3. Addison-Wesley Publishing Company, Reading Massachusetts: 1967.

See also : http://planetmath.org/encyclopedia/SignatureOfAPermutation.html

Monday, December 06, 2004

Why support the Introspector, What is it?

Often I am asked, What is the introspector? What does it provide? Why should I support you? What makes your product unique?

Let me try and put it down for the record, at least a starting point.

When you program in C, you are often confronted with the problem that you want to wrap one structure in another. To do this, you want to be able to traverse all the fields of a structure.

You ask yourself, why don't I have the ability to do this? Why doesn't the compiler support this simple feature? Well, the answer is that the compiler *supports* this feature, you just don't have access to it! There is a wealth of features in the compiler that you are not given access to.

Now, you may ask yourself, why can't I have access to this information? My answer is: you can! The introspector provides it.

Now, you may ask yourself : How will you be able to access this data? When will you get access to it? What will it look like?

The data will be first extracted into rdf. Then it will be presented to you via the rdf api redland. The question is, when will this code be invoked? My goal is to be able to present this data to you at compile time and at runtime!

It should be possible to write code that uses the metadata of the compiler that is usable at any given time. It should be possible to write expressions that are captured at compile time and made available at run time.

Let me give you an example :

Let's say you want to define a new function called funky_print().

You want to make this function very flexible, so that it can process any type that you pass to it. So, you declare it like this: funky_print(void *).

Then you want to generate code to handle all types of calls to it, so you make an introspector query like this : intercept_calls(&funky_print,funky_print_handler);

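Let's say you have this simple set of declarations :
/* Declarations only: none of these functions has a body yet. */
void funky_print (void * args);
void funky_print_handler (void * args);
void intercept_calls (void * function,void * process);

int main()
{
/* Record the relationship between the function and its handler. */
intercept_calls (&funky_print,&funky_print_handler);
return 0;
}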

Now, none of these functions are defined yet. But we have clearly been able to mark the relationships using the compiler, using native C. Now all we need to do is create a program that will process this data once it is extracted.

In the introspector you have the ability to capture all calls to a function. You can filter the rdf and extract all those calls via query. Then you can extract all the types of the parameters, all the sources of data from the caller. Basically you have all the data that the compiler does.

In this case you will be able to generate code for all types of the print routine and then replace the funky_print function calls with a specialized function.
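
As a rough sketch of that last step, the Python below filters a hand-written list of call facts and derives one specialized function name per argument type. The fact layout (callee, arg_type) and the naming scheme are assumptions for illustration only. They are not the introspector's actual rdf vocabulary.

```python
# Hypothetical extract: (call id, property, value) statements about call sites.
CALL_FACTS = [
    ("call1", "callee", "funky_print"),
    ("call1", "arg_type", "struct point *"),
    ("call2", "callee", "funky_print"),
    ("call2", "arg_type", "char *"),
    ("call3", "callee", "puts"),
    ("call3", "arg_type", "char *"),
]


def argument_types_for(facts, function_name):
    """Collect the distinct argument types used in calls to function_name."""
    calls = {s for s, p, o in facts if p == "callee" and o == function_name}
    return sorted({o for s, p, o in facts if p == "arg_type" and s in calls})


def specialized_name(function_name, c_type):
    """Build an identifier such as funky_print_char_ptr from a C type name."""
    suffix = c_type.replace("*", "ptr").replace(" ", "_")
    return f"{function_name}_{suffix}"


if __name__ == "__main__":
    for c_type in argument_types_for(CALL_FACTS, "funky_print"):
        print(specialized_name("funky_print", c_type))
```

A code generator would then emit one function body per name and rewrite each call site to use the matching one.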

The advantage of this technique is that you will be able to define your own metadata and meta functions inside and outside of the compiler without changing the compiler. The meta functions can be written in your language of choice, giving you the freedom to program the way you feel like.

more to follow.

mike

Tuesday, December 23, 2003

I created a new blogger template that allows you to emit rdf
Template
It is a simple way to turn your blog into rdf
Why pay for rss syndication from blogger.com/google?
You can just use cwm to process your blog in n3 format emitted by this archive template
thanks to irc:deltab@freenode.net#rdfig for iconv support
"iconv -t utf8 -f windows-1252 index.n3" to convert my blog into something readable
http://195.140.210.90/blogger

This is the first step to migrate to my new server using rdf

Saturday, November 15, 2003

I have released a new version of the introspector, a proof of concept, something you can look at and learn from. It is a self-contained demo program that allows you to graphically explore the structure of almost any program that you can compile with gcc!

It features the introspector ice cube.
The ice cube contains a superfast, compressed extract of the semantic data of the program that can be compiled in as a lib and loaded into memory in milliseconds.

The graph algorithms are also very fast on constant-size arrays of objects!

Hopefully it will become the new way to embed static semantic resources into your new programs.

We then slice the ice cube by property into nice thin C arrays.

It has a gcc tree extracted out of the dotgnu pnet idlasm code emit function. That means I have reverse engineered a free software component.

The results of the reverse engineering are stored in an rdf repository. Cwm, perl, and shell scripts do the semantic processing of the data. A redland RDF repository is used to interface into the guts of the gcc compiler.

The ASTs are serialized by a patched gcc 3.4 experimental -fdump-translation-unit; you can find the source code in the cvs.

That is emitted into rdf and converted by a perl script into an ice cube.

That is served as slices of data: each attribute is its own vector, with a length equal to the number of nodes in the selected rdf property. There is in fact a matrix of all the objects and the relationships between them stored in the array.
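
To illustrate the slicing, here is a small Python sketch (the real conversion is done by a perl script that is not shown here). It numbers the nodes, keeps one pair of parallel vectors per property, and formats a vector as a C array. The sample triples, the property names, and the function names are all made up for this example.

```python
from collections import defaultdict

# Hypothetical sample data: (subject node, property, object node).
SAMPLE = [
    ("n1", "next", "n2"),
    ("n2", "next", "n3"),
    ("n1", "name", "n4"),
]


def slice_by_property(triples):
    """Number the nodes and build, per property, parallel subject and object vectors."""
    node_ids = {}

    def node_id(name):
        # The first time a node is seen it gets the next free index.
        return node_ids.setdefault(name, len(node_ids))

    slices = defaultdict(lambda: ([], []))
    for s, p, o in triples:
        subjects, objects = slices[p]
        subjects.append(node_id(s))
        objects.append(node_id(o))
    return node_ids, dict(slices)


def as_c_array(name, values):
    """Format one vector as a constant C array definition."""
    body = ", ".join(str(v) for v in values)
    return f"static const int {name}[{len(values)}] = {{{body}}};"


if __name__ == "__main__":
    ids, slices = slice_by_property(SAMPLE)
    for prop, (subjects, objects) in slices.items():
        print(as_c_array(f"{prop}_subject", subjects))
        print(as_c_array(f"{prop}_object", objects))
```

Because every slice is a flat integer array, the generated C can be compiled straight into the target program and needs no parsing at load time.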

This program contains just the linux binary of the program that has all this data compiled into an ICE Cube :

That is emitted into an inline C array for compiling into the target program.

Please join up on the list, come to the #introspector chat zone on freenode.net, and jabber me at mdupont@nureality.ca

Monday, October 06, 2003

Long time, no blog. Really I don't like blogs. But I am trying to show Jaana how a blog works.

So I have been working on a new version of the introspector, one that is fast!!
Based on the idea of data cubes and OLAP, I am calling them ICE CUBES, as in frozen.

An ice cube is a matrix representation of your program's code. I am sure to explain this more later, but right now I am too busy with Jaana! So here are the links :

icecubes
other scripts

Use at your own risk,

mike

Thursday, September 11, 2003

YAL - Yet Another Language

This is really exciting. I hope to use this as a good start for the introspector parrot.

Modelling Tools

List of some graph layout tools

Here you can find my advogato article on the introspector and its relationship to the gcc.

A threat to Free Software? The "GCC Introspector", No a threat to abusers of Power

Wednesday, August 20, 2003

The MetaWrap Project

Saturday, August 09, 2003

Wiki: RelatedProjects

here is a list of related projects to the introspector.

Tuesday, August 05, 2003

SourceForge.net: GSoap Client interface for GForge

gSOAP: SOAP C Web Services

Journal of mdupont (219735)

GSoap interface to GForge Working

Hi All,

You can find a 1.4 mb archive of my current working interface and binaries here :
http://demo.dotgnu.org/~mdupont/gforge/gforge-gsoap.tgz

The soap wsdl api is here :
http://cougaar.org/soap/SoapAPI.php?wsdl

Gforge is the successor to source forge
http://gforge.org/

Gsoap is a fast Soap Implementation
http://www.cs.fsu.edu/~engelen/soap.html

Monday, May 12, 2003

I realize that the introspector is really just enumerating over the ideas in the end. It also interacts with the mind, but in the end, it is almost a waste of time. The final results are what is measured. The time spent calculating is precious, and should be kept to a minimum.

mike

Thursday, May 08, 2003

Check out the introspector WIKI
The Introspector WIKI!

Tuesday, January 14, 2003


New system to track features for the introspector

Wednesday, January 08, 2003


Front End Art
This is a commercial C++ frontend for post processing.

Friday, November 29, 2002


The MINGW32 porting page
of the introspector.


Here is a small example of a function body converted to rdf

Here
In the introspector cvs you will find the SQL database model for the postgres database.

Thursday, October 31, 2002


This is a function type and function decl, with parameters


Some of the classes involved in a function call

Wednesday, October 30, 2002


node base


This is a hyperlinked class model of the introspector.


The introspector class model, high level


Review



The class introspector is made up of two primary components, the input and the output. Input comes from interrupt routines inserted into attributes to create properties. This then allows methods to be attached that feed into the graph layout algorithm. The graph layout algorithm in turn feeds the diagrammer that draws the diagrams. The compiler is fed from the diagram in a code generation schema. The compiler's lexer feeds the parser that builds nodes.

Trees grow out of the seeds of those nodes. These trees are then strung together into new structures.

The gcc is made up of cpp, the preprocessor, and the bison parser with the function yyparse. The gettoken function that reads a token from the stream comes from a lex interface. The cscc is just shown here as an interface to another possible compiler.




The introspector class model, high level

introspector make environment

This diagram shows you the flow of data from the shell into the make and the compiler. It does not show you how the parse tree looks.
That comes soon.

Monday, October 28, 2002


http://gcc.gnu.org/projects/ast-optimizer.html


Here is a cutting edge ast optimizer in the gcc.

Thursday, August 22, 2002

hypothesis :
1. API information is data.
2. By describing an API, we are talking about the interface to something
3. The difference between a model and an API is that a model does not have to present a usable interface
4. The linker presents a set of functions and types to use

rpm2html : a generator of Web pages for RPM packages rpm2html automatically generates Web pages describing a set of RPM packages.

The goals of rpm2html are also to identify the dependencies between various packages and to find the package(s) providing the resources needed to install a given package. Every package is analyzed to retrieve its dependencies and the resources it offers. These relationships are expressed using hyperlinks in the generated pages. Finding the package providing the resource you need is just a matter of a few clicks!

rpmfind : the rpm2html client tool Basically, rpmfind is a program that will find RPM files on rufus for you.

For example, rpmfind gimp will tell you what packages are needed to install Gimp on your machine, where to find them, and how much space it will take on your hard drive (so you can also estimate the download time), and can fetch the required files for you.

Rpmfind can also be used to query the RPM database for existing packages using a keyword or a regular expression.

Rpm2Html: RDF schema for RPM (and others) binary packages The work done on rpm2html, i.e. indexing a large amount
of RPM packages,

http://rufus.w3.org/linux/rpm2html/
http://rufus.w3.org/linux/RPM/

makes me believe that there is a real
need for a mechanism allowing easy propagation of the
metadata associated to RPM packages. The goal is to provide
a simple mechanism to export the important information
about RPM packages, without carrying on the full data.

The W3C has issued the RDF draft which in my opinion
suits perfectly the purpose required, I expect that soon
standard browsers will be able to parse and display metadata
expressed in RDF format:

http://www.w3.org/Metadata/

I started trying to express the information contained
in an RPM and needed by rpm2html export the RPM metadata.

Metadata at W3C Metadata is machine understandable information for the web. The W3C Metadata Activity addressed the combined needs of several groups for a common framework to express assertions about information on the Web, and was superseded by the W3C Semantic Web Activity.

Redfoot.net What is Redfoot?

Redfoot is an extensible RDF server written in Python for building a Semantic Web of P2P nodes. It is being developed by Daniel Krech and James Tauber.
What does Redfoot Include?

* Console Interface
* Rednodes for forming P2P network
* HTTP Server Integration (Currently Medusa, Someday Zope)
* Schema Driven RDF Editor (Web Based)

Primer - Getting into the semantic web and RDF using N3 Primer: Getting into RDF & Semantic Web using N3

The world of the semantic web, as based on RDF, is really simple at the base. This article shows you how to get started. It uses a simplified teaching language -- Notation 3 or N3 -- which is basically equivalent to RDF in its XML syntax, but easier to scribble when getting started.

ITCSL: DAMLJessKB This software is intended to facilitate reading DAML files, interpreting the information as per the DAML language, and allowing the user to query on that information. In this software we leverage the existing RDF API (SiRPAC) to read in the DAML file as a collection of RDF triples. We use Jess (Java Expert System Shell) as a forward chaining production system which carries out the rules of the DAML language. The core Jess language is compatible with CLIPS and this work might be portable to that system. A similar approach is taken by DAML API, they also hook RDF API into Jess. However the bridge they use between the two is a little different and at the moment less complete in at least the publicly available version. Also, the SWI Prolog distribution includes a package to parse RDF and assert as Prolog facts. It should be possible to create a series of Prolog rules similar to the Jess rules used here and leverage that package to create a similar unit for Prolog.

NyktOP Project - Source Languages

schema
(only in .pre; implicit in other generators) used as a form of directories to give a better overview of the different classes. Has its roots in Smalltalk categories. Used in C generator to make smaller source dirs. (In MOOSE they can be connected and thus U can decide to generate certain components)
basetype / simpletype
(not in awkinput) Used to define another type for non-structured attributes. Basetypes should be builtin to the generators (Boolean, String), Simpletypes user defined.
enumerator
(not in awkinput) Used to define an enumeration type (relevant mainly for C , but very usable also in TCL)
foreignclass
(only in .pre) Used to define a class, that must not be generated - e.g. from an external class library used as a type for attributes.
record
(only in .pre) very similar to class, but can't define connections; used as a type for structured attributes.
array / dynarray
(only in .pre) very similar to record, except that it contains an additional array functionality: U can access an indexed value.
class
the most relevant structure for the generators. Thus we show a list of its most common subcomponents:

parent / parents
name(s) of the supertype(s) (multiple only in .pre-ul)
schema
name of the schema, this class resides in.
attribute
(slottype) define an attribute.
reference
(slottype) define a single-sided reference (no support on target side needed).
connection
(slottype) define a double-sided connection (the partner/target-object needs to support a responding slot).
method
(slottype, only in .pre) define a new method in the class (used only for the C generator).
module
(only in .pre) define a module; may e.g. be used to include components, that do not fit into NyktOP class hierarchies, like old (ansi) C libraries.

Monday, August 19, 2002

I have been updating the SourceForge project page.


Check out the task manager.


mike

I have been thinking about the gcc introspector project.



The idea of intercepting a message that is sent from a programmer to the compiler, and trying to reconstruct its original intent, is like an exercise in decrypting a message.

Consider a scientific experiment that has been done by a scientist and is being recorded. It comes from a hypothesis that is based on an observation. The object of our attention is a resource. The predicate is the key part of the statement.

This statement is described by the RDF system.

The computer program is very much like a scientific experiment, an encoding of thought.

If we try to capture this thought, then we are thinking about thought.



more to follow.




Saturday, July 20, 2002

DICE Documentation Welcome to the DICE project documentation!

The goal of the DICE project is to build an IDL compiler, which can translate an interface definition of a component into communication code for L4 micro-kernels. These components (or servers if you like) run on top of the L4 micro-kernel and shall build the Dresden Real-Time Operating System (DROPS) - that's where DICE is derived from: DROPS IDL Compiler. It currently supports two IDL languages (DCE and CORBA IDL) and generates communication code for L4 version 2 compliant micro-kernels.

L4Ka - IDL4 Compiler IDL4 is a stub-code generator for the L4 platform. It generates communication stubs from interface definitions written in a specification language such as CORBA IDL or DCE IDL. It also uses knowledge about the hardware platform and the microkernel to optimize the performance of the generated code.

Features

* Supports CORBA IDL (recommended) and DCE IDL
* Backends available for Fiasco, Hazelnut and Pistachio
* Small component-based OS included as example code
* Client stubs use CORBA C language mapping
* Type import from C/C++ code
* Modular design; easily extensible

Friday, July 19, 2002

Berlin and GNU * Utah's Flick RPC interface compiler already can compile CORBA
interfaces to use Mach RPC directly, rather than higher-overhead
Unix-domain sockets. Other IDL translators might also be modified
to take advantage of Mach RPC. This will mean a big performance
gain for Berlin running on GNU (definitely faster local
message-passing than on a traditional kernel).

omniORB omniORB is a robust high performance CORBA ORB for C++ and Python. It is freely available under the terms of the GNU Lesser General Public License (for the libraries), and GNU General Public License (for the tools). It is one of only three ORBs to be awarded the Open Group's Open Brand for CORBA. This means that omniORB has been tested and certified CORBA 2.1 compliant. You can find out more about the branding program at the Open Group.

Inter-Language Unification -- ILU The Inter-Language Unification system (ILU) is a multi-language object interface system. The object interfaces provided by ILU hide implementation distinctions between different languages, between different address spaces, and between operating system types. ILU can be used to build multi-lingual object-oriented libraries ("class libraries") with well-specified language-independent interfaces. It can also be used to implement distributed systems. It can also be used to define and document interfaces between the modules of non-distributed programs. ILU interfaces can be specified in either the OMG's CORBA Interface Definition Language (OMG IDL), or ILU's Interface Specification Language (ISL).

Thursday, July 18, 2002

SOAP::Lite contains a Net::Jabber transport, among others,
and XML-RPC support.

It looks very promising.

Alternatives to CORBA xpidl is a tool for generating XPCOM interface information, based on XPIDL interface description files. It is based on the GNOME library for ORBit, libIDL.

Hey All,

I've been silent for a while on the topic of tools. I've been cranking
away getting the last pieces in place, and they finally work. I was
able to get 75% of the automation in place pretty simply, but then I
wanted to do the other 25% and it required some thinking, some trial
and error (mostly error), and a lot of coding.

Anyway, it's done:

All tests successful.
Files=126, Tests=828, 15 wallclock secs (11.22 cusr 2.11 csys = 13.33 CPU)

That means the Bio::MAGE::* hierarchy has 126 test files (one
per class) with 828 total regression tests, and they all passed,
YEAH!!!

Tool Inventory:

* XMI.pm: perl module that rips pieces out of an XMI file, and
maintains the information in an internal data structure

* xmi2class.pl: a perl script that converts the internal data
structure into XML files:
- either one large file with all the classes
- individual files, one per class
- package files that contain all classes for a package.

* create-mage-classes.pl: a perl script that takes XML files created
using xmi2class.pl and outputs perl classes, makefiles, and
regression tests. The classes can be output in different ways:
- as packages: all classes belonging to the package are written to
the same package module
- as individ

Small-scale XMI programming: a revolution in UML tool use? XMI is not an easy format for a human to read, and even small models can translate into large XMI files. However, the big advantage of XMI being based on XML is that the whole range of generic XML tools is available. Developers writing scripts to work on code generally avoid the need to parse the code, but scripts working on XMI can easily take advantage of parse tree information, because XML parsers are available in every popular language. The ability to analyse and manipulate XMI files means:
Analyses or changes that used to be tedious to do by hand using the GUI of a UML tool can be automated; and so the temptation to let the UML model get out of step with the code is decreased. For example, changes made to the model may propagate to the code using tools' own forward/reverse engineering combinations, so that a developer may choose to make a change to the model and propagate it to the code rather than just changing the code.
Any developer can write a script to extract information from XMI files and turn it into the input format of a proprietary tool they may be familiar with.

CorbaTrace CorbaTrace is a helpful tool for tracing communications between Corba objects.
Once CorbaTrace is installed, remote calls are intercepted and all information is stored in XML log files. After that, you will be able to apply filters and produce XMI files to see the communications between distant objects on a UML sequence diagram.

Useful
When you develop distributed software, it's very hard to find bugs and to understand where the troubles are in your architecture. Corba is one of the best, most complete and most used standard middleware architectures.
So we have made a tool that can easily trace communications between objects and see the result on a sequence diagram.

Dandelion Dandelion0.5beta1. Partial XMI support. You can now generate XMI UML files by Dandelion. Analyze your project and see it with the various case tools which support XMI (MagicDraw UML, Argo UML, etc)! Here is the MagicDraw sample screenshot.

Graphotron - about Graphotron is a simple XML language for drawing graphs with XPath.
The inspiration came from a Rick Jelliffe description of Schematron, probably this one: The Schematron differs in basic concept from other schema languages in that it is not based on grammars but on finding tree patterns in the parsed document.
XML documents are trees and if some kind of intradocument linking is used, they can describe a general graph. It is often said that a good picture is worth a thousand words and so I have started to look for a way to program something like a pictorial Schematron. I have been pleased to find a few very good programs (Graphviz, VCG, daVinci) which have at least some versions freely available. I did some experiments and found out that drawing pictures with XPath can be very simple.

Wednesday, July 17, 2002

ASPN : Modules GraphViz 1.4
Interface to the GraphViz graphing tool

Hyper Perl HyperPerl is a variant of WikiWikiWeb inspired by DonKnuth's LiterateProgramming. It is supported by two programs, both offered as cgi script...
wiki base -- browse & edit the hypertext source
hp -- extract a straight program by traversing the source
The extractor defines the rules of hyper perl. Right now it simply grabs preformatted text. [How is a Here file represented?] Links found in that text refer to nodes that should be processed in advance of the current node since they probably refer to ...
header comments
constant definitions
subroutine definitions
The extractor will soon offer forms to control its expansion. This will provide ...
fill-in-the-blanks for parameterization
check-boxes for conditional configuration

Graph Viz This is my first attempt to use GraphViz to trace links on wiki. Here I chose to follow the links to the two largest pages cited on any given page. I'm thinking the two most recently edited pages might be an even better choice. -- WardCunningham

Generic Pretty-Printing (Under Construction) The generic pretty-printer package includes tools for unparsing and pretty-printing (formatting) parse trees of any context-free grammar represented in AsFix. The language independent markup language Box is used to connect source language dependent front-ends to target language dependent back-ends. A front-end translates a term over a language to Box, to describe its intended layout. A back-end translates a Box term to some output format. Currently, the system supports translations from Box to AsFix, (ASCII) Text, LATEX, and HTML. This generic framework of front-ends and back-ends is depicted in the figure above.

Comp.compilers: yaccviso 1.0 - a tool for visualizing yacc/bison grammars YaccViso is a tool for visualizing dependencies of non terminal and
terminal symbols in a yacc grammar. If you do not know what yacc is
then you probably won't need YaccViso. It is thus a tool specifically
designed for people who write compilers with yacc who work on large
grammar files.

Tuesday, July 16, 2002

WebMacro: How Web Macro Works Introspection Rules

WebMacro follows the JavaBeans specification when performing class analysis, but also extends that specification.
JavaBeans is essentially a set of coding conventions for how to name accessor methods. If you follow those conventions then lots of tools, including WebMacro, will be able to analyze your class to learn what properties it has.
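
The accessor naming convention is easy to illustrate outside Java. The Python sketch below discovers "properties" purely from `getX` / `isX` / `setX` method names. It shows the convention only: it does not use the `java.beans` API, it ignores the finer points of JavaBeans decapitalization, and the `Customer` class is a made-up example.

```python
import inspect


def bean_properties(cls):
    """Infer property names from getX/isX/setX style method names."""
    props = {}
    for name, _ in inspect.getmembers(cls, callable):
        for prefix, kind in (("get", "read"), ("is", "read"), ("set", "write")):
            rest = name[len(prefix):]
            if name.startswith(prefix) and rest and rest[0].isupper():
                prop = rest[0].lower() + rest[1:]
                props.setdefault(prop, set()).add(kind)
    return props


class Customer:
    """Hypothetical bean-style class."""

    def getName(self):
        return self._name

    def setName(self, value):
        self._name = value

    def isActive(self):
        return True


props = bean_properties(Customer)
```

Here `name` is found to be readable and writable and `active` read-only, without the class declaring either property explicitly.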

Monday, July 15, 2002

SourceForge.net: Project Info - NeoClassIntrospector

NeoClassIntrospector is a Mac OS X Objective C framework created to provide introspection services for the Objective C language when running applications on the Apple Objective C runtime engine. I am opening this project on SourceForge because the code

Castor API: Class Introspector public final class Introspector
extends java.lang.Object
A Helper class for the Marshaller and Unmarshaller, basically the common code base between the two. This class handles the introspection to dynamically create descriptors.

Velocity 1.3-dev API: Class Introspector public class Introspector
extends IntrospectorBase
The basic function of this class is to return a Method object for a particular class, given the name of a method and the parameters to the method in the form of an Object[]. The first time the Introspector sees a class, it creates a class method map for the class in question. Basically, the class method map is a Hashtable where Method objects are keyed by a concatenation of the method name and the names of the classes that make up the parameters. For example, a method with the following signature: public void method(String a, StringBuffer b) would be mapped by the key: "method" + "java.lang.String" + "java.lang.StringBuffer". This mapping is performed for all the methods in a class and stored for
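
The method-map idea (key each method by its name concatenated with the names of its parameter classes, then look methods up from actual arguments) can be sketched in Python. This is an analogy, not Velocity's code: it reads parameter types from annotations, matches on exact types only (no inheritance), and the `Greeter` class and all helper names are hypothetical.

```python
import inspect


def type_name(t):
    """Fully qualified name of a class, e.g. builtins.str."""
    return f"{t.__module__}.{t.__qualname__}"


def method_key(name, param_types):
    """Concatenate the method name and the parameter class names."""
    return name + "".join(type_name(t) for t in param_types)


def build_method_map(cls):
    """Build the per-class map once, keyed by name plus parameter types."""
    method_map = {}
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        params = list(inspect.signature(func).parameters.values())[1:]  # skip self
        method_map[method_key(name, [p.annotation for p in params])] = func
    return method_map


def find_method(method_map, name, args):
    """Look a method up from its name and the actual argument objects."""
    return method_map.get(method_key(name, [type(a) for a in args]))


class Greeter:
    """Hypothetical class to introspect."""

    def greet(self, name: str, times: int):
        return ", ".join([f"hello {name}"] * times)


method_map = build_method_map(Greeter)
found = find_method(method_map, "greet", ["ann", 2])
result = found(Greeter(), "ann", 2)
```

Building the map on first sight of a class and reusing it afterwards is what makes repeated lookups cheap.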

Java(TM) 2 Platform, Standard Edition, v1.2.2 API Specification: Class Introspector public class Introspector
extends Object
The Introspector class provides a standard way for tools to learn about the properties, events, and methods supported by a target Java Bean.
For each of those three kinds of information, the Introspector will separately analyze the bean's class and superclasses looking for either explicit or implicit information and use that information to build a BeanInfo object that comprehensively describes the target bean.

Southern Storm Software, Pty Ltd DotGNU Portable.NET
The goal of this project is to build a suite of free software tools to build and execute .NET applications, including a C# compiler, assembler, disassembler, and runtime engine. The initial target platform is GNU/Linux, with other platforms to follow in the future.

Sunday, July 14, 2002

Proposed OWL Knowledge Base Language Abstract

A proposed abstract syntax is given for the Web Ontology Language. This syntax is divided into a ``light'' or frame-like part and a ``full'' part.

http://graphs.memes.net/
memes.net - Home Node This site is for discussing the use of graph structures.

Feel free to discuss graph structures in community web sites, in topic maps, in information systems, in document management, whatever.

The Graph Visualization Library Visualization is an important aid for debugging graph algorithms. MLRISC provides a simple facility for displaying graphs that adheres to the graph interface. Two graph viewer back-ends are currently supported. (An interface to the dot tool is still available but is unsupported.)

* vcg -- this tool supports the browsing of hierarchical graphs, zoom in/zoom out functions. It can handle up to around 5000 nodes in a graph.
* daVinci -- this tool supports a separate ``survey'' view from the main view and text searching. This tool is slower than vcg but it has a nicer interface, and can handle up to around 500 nodes in a graph.

All graph viewing back-ends work in the same manner. They take a graph whose nodes and edges are annotated with layout instructions and translate these layout instructions into the target description language. For vcg, the target description language is GDL. For daVinci, it is a language based on s-expressions.
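
As a rough illustration of "same graph in, different description language out", here is a Python sketch with two toy back-ends. One emits GraphViz DOT; the other emits a generic s-expression form that is an invented stand-in, not daVinci's actual term language (and GDL is not attempted). The graph representation and all names are assumptions.

```python
def quote(value):
    """Double-quote a value, escaping embedded quotes."""
    return '"' + str(value).replace('"', '\\"') + '"'


def to_dot(nodes, edges):
    """Back-end 1: translate layout annotations into DOT attributes."""
    def attrs(layout):
        return ", ".join(f"{k}={quote(v)}" for k, v in sorted(layout.items()))

    lines = ["digraph g {"]
    for name, layout in nodes.items():
        lines.append(f"  {quote(name)} [{attrs(layout)}];")
    for src, dst, layout in edges:
        lines.append(f"  {quote(src)} -> {quote(dst)} [{attrs(layout)}];")
    lines.append("}")
    return "\n".join(lines)


def to_sexpr(nodes, edges):
    """Back-end 2: the same graph as a made-up s-expression form."""
    def terms(layout):
        return "".join(f" ({k} {quote(v)})" for k, v in sorted(layout.items()))

    parts = [f"(node {quote(n)}{terms(l)})" for n, l in nodes.items()]
    parts += [f"(edge {quote(s)} {quote(d)}{terms(l)})" for s, d, l in edges]
    return "(graph " + " ".join(parts) + ")"


# Hypothetical annotated graph: layout instructions hang off nodes and edges.
nodes = {"entry": {"color": "red"}, "exit": {}}
edges = [("entry", "exit", {"style": "dashed"})]
dot_out = to_dot(nodes, edges)
sexpr_out = to_sexpr(nodes, edges)
```

Both functions take exactly the same annotated graph, which is the property the paragraph above describes.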

MLRISC Writing native code generators for modern processors is a significant investment. Unfortunately it is difficult to reuse this investment for other architectures, and even more difficult to reuse for other source language compilers. MLRISC is a customizable optimizing back-end written in Standard ML and has been successfully retargeted to multiple architectures. MLRISC deals elegantly with the special requirements imposed by the execution model of different high-level, typed languages, by allowing many components of the system to be customized to fit the source language semantics and runtime system requirements.

Standard ML of New Jersey License STANDARD ML OF NEW JERSEY COPYRIGHT NOTICE, LICENSE AND DISCLAIMER.

Copyright (c) 1989-1998 by Lucent Technologies

Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both the copyright notice and this permission notice and warranty disclaimer appear in supporting documentation, and that the name of Lucent Technologies, Bell Labs or any Lucent entity not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission.

Lucent disclaims all warranties with regard to this software, including all implied warranties of merchantability and fitness. In no event shall Lucent be liable for any special, indirect or consequential damages or any damages whatsoever resulting from loss of use, data or profits, whether in an action of contract, negligence or other tortious action, arising out of or in connection with the use or performance of this software.

Google Image Result for www.gilbertwilliams.com/images/Introspector.jpg

Euler proof mechanism

This is a very interesting program. I might even install java to use it!

Or maybe the gcj!

http://www.w3.org/2000/10/swap/cwm.py Closed World Machine (also, in Wales, a valley - topologically a partially closed world perhaps?) This is an application which knows a certain amount of stuff and can manipulate it. It uses llyn, a (forward chaining) query engine, not a (backward chaining) inference engine: that is, it will apply all rules it can but won't figure out which ones to apply to prove something.
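
The forward-chaining behaviour described here ("apply all rules it can") boils down to running rules until nothing new can be derived. Below is a minimal propositional sketch in Python. It is not cwm or llyn, which work on RDF/N3 with variables; the facts, rules, and function name are invented for illustration.

```python
def forward_chain(facts, rules):
    """Apply every applicable rule repeatedly until a fixpoint is reached."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)   # derive a new fact
                changed = True          # and go round again
    return known


# Hypothetical rules: (premises, conclusion)
rules = [
    (("rain",), "wet"),
    (("wet", "cold"), "ice"),
    (("sun",), "dry"),
]
derived = forward_chain({"rain", "cold"}, rules)
```

Note that nothing here starts from a goal: the engine never asks which rules would prove `ice`, it just derives whatever follows from the facts, which is the distinction from backward chaining drawn above.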

Friday, June 28, 2002

RefactoringBrowser: Refactoring Browser Welcome to the Refactoring Browser wiki. The Refactoring Browser is a Smalltalk browser that makes it much easier to refactor your program. It automatically supports many refactorings, and is constantly being improved. It works with several versions of Smalltalk, including VisualWorks and VisualAge. This wiki is a place to get questions answered, to find the latest source, to make suggestions, and to find out about upcoming changes.

Perl Knowledge-Based Objects

Perl ObjectFrames

Who?
----

Chris Mungall

What?
-----

Unholy union of perl objects and frames

Why?
----

* fun

* dissolve code/data boundary

* introspection

* better modeling

* toss out cranky perl object model and replace it with one that
supports

bidirectional links

genuine introspection

slots

properties (aka associations) as first class "objects"

* frames can act just like regular objects if you want them to

* frames can be overridden with normal perl objects/packages

* normal perl objects can be extended with frames

* frames can be extended with normal perl objects

* natural framework for persistence

* reasoning

* will play well with RDF & RDFS (xml or n3)

* will play well with DAML+OIL

* natural framework for querying

* some of the above claims may be false
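
To give a feel for what slots, bidirectional links, and introspection could look like together, here is a tiny Python sketch of a frame. It is only an illustration of the concepts in the list above, not Perl ObjectFrames itself; the `Frame` class, the `inverses` table, and the slot names are all hypothetical.

```python
class Frame:
    """A frame is a named bag of slots; some slots have an inverse slot."""

    inverses = {}   # slot name -> name of the inverse slot

    def __init__(self, name):
        self.name = name
        self.slots = {}

    def add(self, slot, value):
        """Fill a slot and, for frame values, maintain the inverse link."""
        self.slots.setdefault(slot, []).append(value)
        inverse = Frame.inverses.get(slot)
        if inverse and isinstance(value, Frame):
            value.slots.setdefault(inverse, []).append(self)

    def describe(self):
        """Introspection: list the slots this frame currently has."""
        return sorted(self.slots)


Frame.inverses.update({"part_of": "has_part", "has_part": "part_of"})

cell = Frame("cell")
nucleus = Frame("nucleus")
nucleus.add("part_of", cell)      # also records cell has_part nucleus
nucleus.add("label", "Nucleus")
```

No accessor methods were written for `part_of` or `has_part`; the slot table itself is the data model, which is the "too lazy to define all the accessor methods" case from the audience list.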

Audience?
---------

* mad hackers

* Semantic Webbies

* perl OO programmers with big object models and too lazy to:

define all the accessor methods

maintain boilerplate pod docs

continuously reverse engineer UML diagrams from code by hand
for those who can't grok just by looking at .pm / pod

basically write the same code over and over again
when a higher level spec would be nicer

* those who realise the standard OO paradigm, UML etc isn't sufficient
or expressive enough to
model complex domains (eg biology)

* perl people interested in
AI
Frames
Ontologies
DAML+OIL
Semantic Web
etc

* starry eyed idealists

Status?
-------

* pre pre pre alpha

Why Not?
--------

* cf status

* sanity

* you're already using lisp and are quite happy with that thank you

* you're a serious AI boffin and this is the work of an amateur

Context
-------

www.semanticweb.org

search on google for ontologies / frames / etc

Thursday, June 27, 2002

The GNU Manifesto - GNU Project - Free Software Foundation (FSF) "Competition makes things get done better."
The paradigm of competition is a race: by rewarding the winner, we encourage everyone to run faster. When capitalism really works this way, it does a good job; but its defenders are wrong in assuming it always works this way. If the runners forget why the reward is offered and become intent on winning, no matter how, they may find other strategies--such as, attacking other runners. If the runners get into a fist fight, they will all finish late.
Proprietary and secret software is the moral equivalent of runners in a fist fight. Sad to say, the only referee we've got does not seem to object to fights; he just regulates them ("For every ten yards you run, you can fire one shot"). He really ought to break them up, and penalize runners for even trying to fight.

The GNU Manifesto - GNU Project - Free Software Foundation (FSF) "Don't people have a right to control how their creativity is used?"
"Control over the use of one's ideas" really constitutes control over other people's lives; and it is usually used to make their lives more difficult.

The GNU Manifesto - GNU Project - Free Software Foundation (FSF) Complete system sources will be available to everyone. As a result, a user who needs changes in the system will always be free to make them himself, or hire any available programmer or company to make them for him. Users will no longer be at the mercy of one programmer or company which owns the sources and is in sole position to make changes.
Schools will be able to provide a much more educational environment by encouraging all students to study and improve the system code. Harvard's computer lab used to have the policy that no program could be installed on the system if its sources were not on public display, and upheld it by actually refusing to install certain programs. I was very much inspired by this.

The GNU Manifesto - GNU Project - Free Software Foundation (FSF) By working on and using GNU rather than proprietary programs, we can be hospitable to everyone and obey the law. In addition, GNU serves as an example to inspire and a banner to rally others to join us in sharing. This can give us a feeling of harmony which is impossible if we use software that is not free. For about half the programmers I talk to, this is an important happiness that money cannot replace.

The GNU Manifesto - GNU Project - Free Software Foundation (FSF) Many programmers are unhappy about the commercialization of system software. It may enable them to make more money, but it requires them to feel in conflict with other programmers in general rather than feel as comrades. The fundamental act of friendship among programmers is the sharing of programs; marketing arrangements now typically used essentially forbid programmers to treat others as friends. The purchaser of software must choose between friendship and obeying the law. Naturally, many decide that friendship is more important. But those who believe in law often do not feel at ease with either choice. They become cynical and think that programming is just a way of making money.

http://lists.alt.org/pipermail/fsl-discuss/2002-June/000508.html [fsl-discuss] Introduction and A Modest Proposal for a GNU infrastructure license RGPL
James Michael DuPont mdupont777@yahoo.com
Fri, 21 Jun 2002 01:02:37 -0700 (PDT)



Hello!

My name is James Michael DuPont,
I am working on a project to extend the GCC compiler.
The introspector.sf.net.

To safely implement this, a new license scheme that is
to be compatible with the GPL is needed.

Before I propose this let me point out a few things
that are often misunderstood :

1. The GPL protects the GCC against people not
publishing front and back ends to the compiler. With a
less restrictive license, we would have less of the
gcc.

2. Copyright does not cover the inputs and outputs and
usage (how and why it is run) of software. This license
attempts to cover them via an end-user license agreement.

3. I am not trying to change the license of any
existing software, I am trying to access the internal
data of GPLed programs while protecting it as if it
were in memory of the program.

4. The motivation for creating this library is to
create a haven for free software tools to interact
with the meta data of the compilers.
It is not a ge

http://groups.google.com/groups?q=backend+gcc+stallman&hl=de&lr=&ie=UTF-8&oe=UTF8&selm=3r5n4j%24599%40cmcl2.NYU.EDU&rnum=5 From: Richard Kenner (kenner@lab.ultra.nyu.edu)
Subject: Re: Modula-3 (Was: Comparison of languages for CS1 and CS2)
Newsgroups: comp.lang.ada, comp.lang.modula2, comp.lang.modula3
Date: 1995/06/08


In article <1995Jun7.101538.26580@wavehh.hanse.de> cracauer@wavehh.hanse.de (Martin Cracauer) writes:
>>I doubt it, what possible FSF goal can be met by such a separation, it
>>would seem to promote exactly that which FSF tries hard to prevent, the
>>creation of proprietary technologies which make use of free software.
>
>I don't think so. IMHO, the only reason why the separation isn't done
>is that no one spent the time on it.

Robert is quite correct. Richard Stallman has been adamantly opposed
to any attempt to split front ends into separate programs for
precisely the reason Robert gives.

>The GNU-Objective-C folks, for example, would be very happy to have
>their frontend mixed with the C++ frontend to allow Objective-C++
>programs.

True, but what does this have to do with the separation issue? It
seems instead that such mixture would be even *less* separated.

>From what I heard, the FSF would be happy to make the gcc backend an
>easier-to-use tool for language implementors.

Certainly true, but again this has nothing to do with separation of
the kind Robert was talking about.

>The backend has to be modified anyway

Re: Converting the gcc backend to a library?
In message <200001092033.VAA01200@loewis.home.cs.tu-berlin.de> you write:
> > I submitted a patch in 1994, which turns the gcc backend into a
> > library. I was told that it would lead to the gcc backend based
> > commercial compiler. I am enclosing my ChangeLog here. I guess
> > I can recreate it if I won't waste my time this time.
>
> I hope this objection is not valid anymore (if it ever was). If the
> gcc backend is a library, derived compilers would still have to be
> licensed under the GPL. And if somebody wants to produce a commercial
> compiler that is available under GPL terms, so be it. I know companies
> who do this right now...
If it takes a significant step towards the ability to turn the backend into
a shared library, then it has that kind of potential. Changes of that nature
would have to be OK'd by the steering committee, not any individual maintainer
due to the political issues.

Re: Converting the gcc backend to a library? > I submitted a patch in 1994, which turns the gcc backend into a
> library. I was told that it would lead to the gcc backend based
> commercial compiler. I am enclosing my ChangeLog here. I guess
> I can recreate it if I won't waste my time this time.

I hope this objection is not valid anymore (if it ever was). If the
gcc backend is a library, derived compilers would still have to be
licensed under the GPL. And if somebody wants to produce a commercial
compiler that is available under GPL terms, so be it. I know companies
who do this right now...

Re: Converting the gcc backend to a library? On Dec 26, 1999, Mikael Djurfeldt wrote:

> Is there any work going on to convert the gcc backend to a library?

Only in the sense that it has been proposed before, ``blessed'' by the
maintainers. It's been in my personal wishlist for quite a long time,
and I intend to start working on it in some not-so-distant future.

> How difficult would this be?

If it is just a matter of packing the appropriate objects into a
library, not much. It shouldn't be too hard to figure out which files
in the gcc directory are part of the back-end and which are part of
the C front-end.

However, making this library useful after installation may be a
completely different issue, since we'd have to figure out which
headers to install to make the library useful after the build tree is
gone, for example.

Guile (About Guile) What is Guile? What can it do for you?
Guile is a library designed to help programmers create flexible applications. Using guile in an application allows programmers to write plug-ins, or modules (there are many names, but the concept is essentially the same) and users to use them to have an application fit their needs.
There is a long list of proven applications that employ extension languages. Successful and long-lived examples in the free software world are GNU Emacs and The GIMP.
Very popular examples of extending server applications are the Apache project's Perl and PHP modules.
Extension languages allow users, programmers, and third-party developers to add features to a program without having to re-write the program as a whole, and it allows people extending a program to co-operate with each other, without having to expend any extra effort.

The GNU Compiler Writer's Jump Point
Welcome to the GNU Compiler Writer's Jump Point. Right now this page is only dedicated to providing information about writing new language front ends to GCC. Later, more information will be available.
Information on the GCC Backend Tree Interface
The tree data structure is defined in the files tree.def and tree.h in the GCC source code. This structure is the link between a language front-end and the GCC back-end. Although most of these links do not address this issue directly, they provide helpful information about the subject.

http://gcc.gnu.org/ml/gcc/1999-12/msg00539.html Mikael Djurfeldt wrote:

> That is, I'd like to link my interpreter with the gcc backend.

This is a very FAQ, for which there is no good answer. Here are
some resources which I have collected.

There is also 'Using and Porting GNU CC' which comes with GCC and
you can also buy a hardcopy from the FSF.

Integration with the GCC back end is mainly via the tree.[ch] and
tree.def interface and the assorted subroutines that come with
that. You don't actually generate the RTL yourself - the routines
do that for you from the tree nodes.

You will see many complaints in the documentation that the front
end and back end are not cleanly separated. Also the back end is
more friendly to C front ends at least though this is not a show
stopper.

The back end interface is also pretty low level and could
usefully be encapsulated into some more friendly interface. In
that sense a library is a good idea. I am trying to integrate a
cobol front end into gcc at present and I will probably do some
encapsulation, but this does not have a fixed end date! So as far
as I know, you will have to do a fair bit of work.

Having said that, GCC back end provides tremendous value - 30
platforms supported out automatically; lots of optimisation;
automatic creation of numerous varieties of debugging
information, ability to use gdb for debugging.

I'm not sure also if you are aware

Re: Converting the gcc backend to a library? > Is the gcc development team interested in one or two of these
> developments if they were done right?

Even though I cannot speak for the team (just for myself), I think it
is a safe assumption that the answer is "yes". The critical matter is
that these changes (like any other) have to be done "right": they
should be well-documented, do not interfere with "normal" operation,
be maintainable, etc.

For step 2, it is not all that clear to me that the best solution is
to pack the globals into data structures, as you then have to pass the
pointer to the globals around from function to function. That would be
a significant change, and unnecessary for stand-alone operation.

Of course, if you can arrange to provide additional advantages for
stand-alone operation as well (eg. compiling multiple files in a
single cc1/cc1plus/f771/etc invocation), then that might provide the
rationale for an even larger change.

Converting the gcc backend to a library?
To: gcc at gcc dot gnu dot org
Subject: Converting the gcc backend to a library?
From: Mikael Djurfeldt
Date: Sun, 26 Dec 1999 19:53:37 +0100
Reply-to: Mikael Djurfeldt



I'm interested in writing an interpreter which incrementally compiles
functions and loads them into the process, i.e. an interactive
development environment which still can execute code efficiently.

I'd like to write a front-end which transforms the source language
into RTL, calls the backend to produce object code, and loads it.

That is, I'd like to link my interpreter with the gcc backend.

Is there any work going on to convert the gcc backend to a library?
How difficult would this be?

If it requires a reasonable amount of work, I'd be prepared to provide a
helping hand. Is this something which the gcc development team could
imagine including in the gcc distribution?

[I've just joined the gcc list, so I'm not sure how out-of-context I
am when asking this question.]

Best regards,
/mdj

http://gcc.gnu.org/ml/gcc/2000-01/msg00572.html To: gcc at gcc dot gnu dot org
Subject: Re: Converting the gcc backend to a library?
From: Richard Stallman
Date: Mon, 17 Jan 2000 19:51:25 -0700 (MST)
Reply-to: rms at gnu dot org



Companies often try to make software non-free, and some would write
non-free add-ons to GCC if we let them. The reason we have free C
and Objective C support is because the companies which wrote these
front ends had no *feasible* way to use them without making them part
of GCC, where the GPL required them to be free. It is vital that we
preserve this situation.

Anything that makes it easier to use GCC back ends without GCC front
ends--or simply brings GCC a big step closer to a form that would make
such usage easy--would endanger our leverage for causing new front
ends to be free.

Because of this, the GNU Project will in all probability not install
such changes, should they be available. This statement reflects a
firm conclusion based on a decade of thought.

I ask anyone who would like to make such changes in GCC to please
contact me privately. I would like to talk with you about the ideas
you are interested in working on, to look at the magnitude of their
potential benefits, and consider other possible ways of achieving
them. Please think about the importance of future free front ends,
as well as the interest of the project you are

SRC Modula-3 home page SRC Modula-3
This region of the web describes SRC Modula-3 -- a Modula-3 compiler for Unix, Windows/NT, and Windows 95 -- and the terms under which it is distributed.
SRC Modula-3 was built by the DEC Systems Research Center and is available via anonymous ftp from gatekeeper.dec.com in pub/DEC/Modula-3/release-X.Y and its mirrors.
The distribution contains a Modula-3 compiler and runtime, a large set of libraries, and a few other tools. On the Unix platforms, the compiler uses a gcc-based back-end and should be fairly easy to port. On the Win32 platforms, the compiler uses a native code generator. Except for the very lowest levels of the thread implementation, the entire system is written in Modula-3.
The original compiler and runtime system were designed and implemented by Bill Kalsow and Eric Muller. The system should be of interest to two camps: those interested in trying out Modula-3 and those interested in compiler hacking.

http://groups.google.com/groups?q=:Tim+Mann+(mann%40pa.dec.com)+stallman&hl=de&lr=&ie=UTF-8&oe=UTF8&selm=516vkm%24509%40src-news.pa.dec.com&rnum=2 As the maintainer of a program that's partly under the GPL, I'd like
to wade in on this licensing discussion.

My own view on free software is this: If someone wants to write
software and give it away for free use by anyone, he should be able to
do that. If he wants to keep it proprietary and charge for the right
to use it, he should be able to do that.

Personally, when I write software that I want to give away, I am happy
for anyone to use it in any way they please. If they want to take
some of my code and incorporate it along with some of their work into
a commercial, proprietary product, that does not bother me. If I'm
giving code away, I'm giving it to everyone, not only to other people
who also want to give their code away. If someone uses my code in a
proprietary product that does not supply enough added value to make it
worth the price and to compensate for the source not being available,
then people will just use my free version instead. No one can stop
them from doing this if I retain the copyright on it and grant them
the necessary rights. (See the X consortium license and the BSD
license for examples of how this is done.)

This is not Richard Stallman's view. He says that the copyright law
is immoral. All software should be free, regardless of whether the
person who wrote it wants it to be free or not. The GPL is written in

http://groups.google.com/groups?q=frontend+backend+gcc+stallman&hl=de&lr=&ie=UTF-8&oe=UTF8&selm=8h4jmt%24n9c%40src-news.pa.dec.com&rnum=1 m3.c and the related modifications to gcc are not in the gcc CVS tree
because Richard Stallman does not want them. The whole purpose of m3.c
is to allow gcc to act as a backend for a frontend that is a separate
program, communicating over a pipe. Because the frontend and backend
are separate programs with this arrangement, the frontend does not have
to be under the GPL. This provides an escape hatch from the GPL's so-called
"viral" nature (all additions or modifications to a GPL'ed program must
also be GPL'ed) and thus conflicts with Stallman's goals. Modula-3 does
use this escape hatch; the compiler frontend is under the SRC Modula-3
license, a quite liberal license that fits the Open Source definition
but is not compatible with the GPL.

Thus, although we could probably have gotten management permission to
transfer m3.c's ownership to the FSF, the FSF would then have refused
to distribute it, so that would not have been too useful. Instead we
put m3.c under the GPL (which is necessary because it's linked into the
same address space as part of gcc), but left the ownership with Compaq.
(By the way, this actually wouldn't prevent the FSF from distributing
the code *if they wanted it*; the last time I looked, there were several
files in the gcc distribution that were under the GPL but copyrighted
to some entity other than the FSF.)

An alternat

Google Groups: View Thread "GPL and NDAs"

In article <1995Jun7.101538.26580@wavehh.hanse.de> cracauer@wavehh.hanse.de (Martin Cracauer) writes:
>>I doubt it, what possible FSF goal can be met by such a separation, it
>>would seem to promote exactly that which FSF tries hard to prevent, the
>>creation of proprietary technologies which make use of free software.
>
>I don't think so. IMHO, the only reason why the separation isn't done
>is that no one spent the time on it.

Robert is quite correct. Richard Stallman has been adamantly opposed
to any attempt to split front ends into separate programs for
precisely the reason Robert gives.

>The GNU-Objective-C folks, for example, would be very happy to have
>their frontend mixed with the C++ frontend to allow Objective-C++
>programs.

True, but what does this have to do with the separation issue? It
seems instead that such mixture would be even *less* separated.

>From what I heard, the FSF would be happy to make the gcc backend an
>easier-to-use tool for language implementors.

Certainly true, but again this has nothing to do with separation of
the kind Robert was talking about.

>The backend has to be modified anyway to allow better optimization for the
>deeper pipelines of modern CPUs.

There are lots of optimizations that could be added to the backend to
produce better code; GCC does reasonably well now with pipeline
delays, so the above doesn't appear to be one of the more important.

However, again, what does this have to do with the separation issue?

>If the current way to implement m3 causes major overhead, maybe the
>time required for every m3-inside-gcc-backend release would be better
>invested in turning gcc into something that causes less overhead.

The experiences of the GNAT project show how valuable GCC has been to
language implementation since they were able to concentrate solely on
language issues and leave code generation, optimization, and
portability issues to GCC.

However, everything can be improved. I welcome your suggestions for
making it easier to write front ends to GCC.

HPANDF: High Performance ANDF (Architecture Neutral Distribution Format)



SUMMARY
HPANDF is an extension of OSF's ANDF required to provide language-independent front-end / back-end factorization for data parallel languages such as HPF, APC or HPADA. This project develops the HPANDF design and builds a prototype implementation. At the current stage, the ANDF extensions required to support array operations such as Fortran90 matrix arithmetic are implemented and we are designing the translation of ALIGN and DISTRIBUTE directives.

Comp.compilers: Re: Generating Java Bytecode > [Before you head down this path, you really should look at the
> history and learn why all the previous UNCOL projects failed. They
> all looked great with one or two input languages and targets, then
> collapsed of heat death when they tried to generalize more. -John]

Has this happened to ANDF yet? ANDF strikes me (as an idea; I have no
experience of it in practice) as a Good Thing, because it keeps lots
of high-level information which is of potential use in producing
machine code from ANDF. For example, although array bounds checks
would be lovely, I'd like my final code to contain very few of them,
particularly inside loops! I can imagine it being difficult to safely
remove bounds checks in a poorly designed bytecode: one would
potentially have to reconstruct the loops and things.

On the other hand, ANDF probably isn't appropriate for chips to execute.
[Haven't heard much about ANDF lately, perhaps Stavros M. can point us
at recent info. -John]

Michael Matz - Re: java bytecode considered bad
To: Trent Waddington
Subject: Re: java bytecode considered bad
From: Michael Matz
Date: Fri, 23 Feb 2001 01:24:14 +0100 (MET)
cc: Gerald Pfeifer, Fergus Henderson,



Hi,

On Fri, 23 Feb 2001, Trent Waddington wrote:
> > I don't see why we should want to reduce the functionality for users of
> > GCC to avoid that; unless I miss something, that is.
> >
>
> finally, the voice of sanity.

Well, in case you haven't noticed. Most of us are not of the opinion of
RMS on this topic (this special one, using JBC as IL; not generally the
fear of having a feasible IL, which some share with him).

> If you were going to choose an intermediate language to dump to
> proprietary backends, wouldn't a low level C be more useful than JBC?

I think neither JBC nor C are feasible ILs, because too much information
is lost, and e.g. C allows things which makes a mapping from another
language to C unnecessarily inefficient, when compiled from that "IL"
(e.g. pointers).


Ciao,
Michael.

http://www.eecs.harvard.edu/~nr/pubs/c--gc-abstract.html
Hi,

On Wed, 21 Feb 2001, Fergus Henderson wrote:

> > If it is possible to compile languages such as C into Java byte codes,
> > I see a great danger. The danger is that people will use Java byte
> > codes to hook GCC up to proprietary back ends and proprietary front
> > ends.
>
> People can already hook GCC up to proprietry front ends by simply
> having their front end generate C code. There are certainly a number

I think the fear of RMS is more, that people could write a new _backend_
for their hardware, do not publish it, but still use all the nice
frontends of GCC. With an intermediate language they could do this
legally, cause they don't have to link to GCC, but only write a reader for
that IL. This backend, because proprietary and targeted to one hardware
could probably generate better code, than GCC in general, so it might be
successful, although it "steals" all the hard work put in our frontends.
Without a feasible IL which GCC can generate this becomes legally
impossible. Now for this case, I don't know, if Java BC is a feasible IL
(meaning all languages can be compiled without loss of semantic and
information into JBC. Without loss of semantic can work of course, after
all that's the point of compiling. But without loss of information might
be impossible. That goal would be needed to generate efficient code
(think e.g.

C-- Home
Welcome to C--
Suppose you are writing a compiler; how will you get quality machine code? Writing your own code generator is a lot of work, so you might use somebody else's. There appear to be three notable, freely available code generators: VPO, MLRISC, and the gcc back end. Each of these impressive systems has a rich, complex, and ill-documented interface---and once you start to use one, you will be unable to switch easily to another. Furthermore, to use MLRISC you must write your front end in ML, to use gcc you must write it in C, and so on.
You might also generate C, if you can afford its calling conventions. And forget about proper tail calls, computed gotos, accurate garbage collection, efficient exceptions, or source-level debugging.
All of this is most unsatisfactory. It would be much better to have one portable assembly language that could be generated by a front end and implemented by any of the available code generators. Such a language should serve as the interface between high-level compilers and retargetable, optimizing code generators. Life would be easier for compiler writers, and authors of code generators would have customers instantly. C-- is that language.

http://www.eecs.harvard.edu/~nr/pubs/c--gc-abstract.html
C--: a portable assembly language that supports garbage collection (Abstract)
Simon Peyton Jones, Norman Ramsey, and Fermin Reig
For a compiler writer, generating good machine code for a variety of platforms is hard work. One might try to reuse a retargetable code generator, but code generators are complex and difficult to use, and they limit one's choice of implementation language. One might try to use C as a portable assembly language, but C limits the compiler writer's flexibility and the performance of the resulting code. The wide use of C, despite these drawbacks, argues for a portable assembly language. C-- is a new language designed expressly for this purpose. The use of a portable assembly language introduces new problems in the support of such high-level run-time services as garbage collection, exception handling, concurrency, profiling, and debugging. We address these problems by combining the C-- language with a C-- run-time interface. The combination is designed to allow the compiler writer a choice of source-language semantics and implementation techniques, while still providing good performance.
This paper is available in TeX DVI format (101K), PostScript (407K), and PDF (104K).
Readers may also be interested in Machine-independent support for garbage collection, debugging, exception handling, and concurrency. This older report covers substantially the same material, but it goes into much greater detail about other high-level services, and it presents the

http://gcc.gnu.org/ml/gcc/2001-02/msg00895.html
java bytecode considered bad
To:
Subject: java bytecode considered bad
From: Trent Waddington
Date: Wed, 21 Feb 2001 03:19:22 +1000 (GMT+1000)




Following is a dialog I have had with RMS over the last few weeks. The
skinny of it is that RMS thinks having gcc both generate and accept as an
input java bytecode allows folks to do nasty proprietary things with gcc
so he's not interested in the backend for the jvm which I wrote 18 months
ago (and doesn't think anyone else should be). I have tried to explain
that java bytecode (especially the stuff I generate) is not a good
intermediate language... I'll let the list handle it.

------------

>From s337240@student.uq.edu.au Wed Feb 21 03:11:22 2001
Date: Wed, 7 Feb 2001 08:00:02 +1000 (GMT+1000)
From: Trent Waddington
To: rms@ai.mit.edu
Subject: java backend

Hi. I dont know if you remember me but I worked on a java bytecode
backend to gcc which was released early last year. At the time I was
instructed that it would be impossible to get the copyright on the backend
assigned to the FSF. I would like to try again to obtain the copyright
assignment as the ownership of this code is no longer a priority to the
university that I was working for at the time. As far as I know gcc does
n

Fergus Henderson - Re: java bytecode considered bad
On 21-Feb-2001, Trent Waddington wrote:
> From: Richard Stallman
> To: s337240@student.uq.edu.au
> Subject: Re: java backend
>
> If it is possible to compile languages such as C into Java byte codes,
> I see a great danger. The danger is that people will use Java byte
> codes to hook GCC up to proprietary back ends and proprietary front
> ends.

People can already hook GCC up to proprietary front ends by simply
having their front end generate C code. There are certainly a number
of proprietary language front ends around that generate C code, e.g.
Cfront (the original C++ compiler), some Eiffel compilers, etc., as
well as many free software compilers that do the same. And if you
plan to use the GNU back-end, then C is almost certainly going to be a
better target language than Java or Java byte codes. So I think the
worry about proprietary front ends is misplaced. The cat is out of the
bag, and whatever damage will result from that is already unstoppable.

--
Fergus Henderson | "I have always known that the pursuit
| of excellence is a lethal habit"
WWW: | -- the last words of T. S. Garp.

http://www.info.uni-karlsruhe.de/~andf/
What is the ANDF technology?
The Architecture Neutral Distribution Format (ANDF) is a software porting technology, which makes it possible to develop shrink-wrapped software for open systems, independent of any particular processor architecture. The ANDF intermediate format is also often seen - and used - as a compiler technology. The specification defines an integration interface between the two major components of a multi-platform cross-compilation system. The compilation of the source code is turned into a two stage process. In the first (ANDF Producer) stage, the application is transcribed into an ANDF format which utilises generalised declarations of the API calls used, together with generalised definitions of data types, constants, and so on. In the second (ANDF Installer) stage, the entities generated in the first stage are linked together and then mapped onto a concrete machine through the use of processor-specific libraries which implement the API calls and data formats. This phase also includes the optimized mapping to machine code.
The ANDF standard is a result of a request for technology (RFT) by the Open Software Foundation (OSF) in May 1989. In June 1991, DRA's TenDRA (TDF) technology was selected by OSF to be the base technology for OSF's ANDF technology.
Today many tools are available: frontends for many common languages and, especially, a great variety of installers.
Please find more detailed information about ANDF in our "Technical Info" s

Geocrawler.com - gcc-help - ANDF

http://slashdot.org/features/98/10/20/116240.shtml
Bruce Stephens has written in with a writeup on something he considers pretty cool; perhaps you'll agree. It's about something you may not have heard of: ANDF. Looks interesting for the hardcore.
Whatever happened to ANDF?
Once upon a time, there was a neat technology called ANDF: Architecture Neutral Distribution Format. It even merits a place in the GNU Project's task list: provide a decompiler for ANDF. (I'm not sure whether this item is still there; it's probably not high on the list of tasks!)
I haven't seen it mentioned much in years, however.
How does ANDF work?
ANDF is a format: it's a flattened representation of the abstract syntax tree for a program. Programs are compiled using a number of tools:
1. A producer, which produces target-independent ANDF from source
2. A linker, which links together some target-independent ANDF capsules
3. An installer, which combines target-independent capsules and target-specific ones, and knows how to produce target-specific code.
Much of 3 can actually be fairly portable. Many optimizations can be cast as portable manipulations of an abstract syntax tree. What's more target-specific is which of these manipulations you use. More than that; ANDF can represent a range of levels of detail, so even quite low-level things can be done using code that is shared between targets.
So, if you want to draw a line between developers (who compile code to produce binaries) and users (who just want to use the

TenDRA
TenDRA is a portable C and C++ compiler, developed by DERA around the TDF/ANDF intermediate format. As well as being a compiler, it can provide strict checks for language and API conformance. A feature about ANDF on SlashDot provoked a good deal of discussion on the capabilities of the compiler and the merits and demerits of TenDRA and ANDF.
TenDRA is free software, with packages being available for both Debian GNU/Linux and FreeBSD. The Debian package is currently in the unstable distribution, which is due to be released as Debian 2.2. The FreeBSD port is available from their ports collection.

SOFTWARE RIGHTS
$Id: //depot/code/org.antlr/release/antlr-2.7.1/RIGHTS#1 $

ANTLR 1989-2000 Developed by jGuru.com (MageLang Institute),
http://www.ANTLR.org and http://www.jGuru.com

We reserve no legal rights to the ANTLR--it is fully in the
public domain. An individual or company may do whatever
they wish with source code distributed with ANTLR or the
code generated by ANTLR, including the incorporation of
ANTLR, or its output, into commerical software.

We encourage users to develop software with ANTLR. However,
we do ask that credit is given to us for developing
ANTLR. By "credit", we mean that if you use ANTLR or
incorporate any source code into one of your programs
(commercial product, research project, or otherwise) that
you acknowledge this fact somewhere in the documentation,
research report, etc... If you like ANTLR and have
developed a nice tool with the output, please mention that
you developed it using ANTLR. In addition, we ask that the
headers remain intact in our source code. As long as these
guidelines are kept, we expect to continue enhancing this
system and expect to make other tools available as they are
completed.

OpenCyc
This document describes the Cyc Application Programmers Interface (API). This is the protocol which allows applications to connect to and use the various CycL modules and functionality which together are used to maintain the Cyc Knowledge Base.
The Cyc API is divided into two layers:
The Content Layer categorizes the available functions, and provides the function signatures and documentation used by applications. The java CycAccess class contains the majority of these functions for ease of integration.
The Transport Layer establishes connection to a Cyc server and performs the message handling. The ASCII telnet connection is useful for debugging and remote administration, and the binary CFASL interface is used for applications. The java CycConnection, CfaslInputStream and CfaslOutputStream classes contain the transport layer functions.

SourceForge.net: Project Info - Web Resource Application Framework

http://www.linux-mag.com/2001-04/GCC_net_01.html
GCC.NET
What is Required for GCC to Support Microsoft's .NET?
by Mark Mitchell

The February 2001 Linux Magazine presented an article entitled Embrace and Extend: What Can Linux Learn from Microsoft's .NET? In that piece, Jon Udell put forth the notion that Microsoft's .NET initiative is built upon a number of ideas that have substantial technical merit and argued that GNU/Linux users ought to consider embracing and extending the platform.
However, while that article laid out many of the reasons why .NET might be interesting to GNU/Linux aficionados, it did not spend much time on the technical aspects of how supporting .NET on GNU/Linux would work. Because this is a topic worthy of further discussion, we will take an in-depth look at what it will take to make the GNU Compiler Collection (GCC) support .NET.
The .NET initiative is Microsoft's bid to permit the development of components, written in a wide variety of programming languages, that can execute on a wide variety of operating systems and hardware platforms. Put more simply, Microsoft thinks you'll be able to write Python code on a SPARC Solaris workstation, Visual Basic code on a Windows NT machine -- and seamlessly link and execute the two together as one program on an embedded system. You can even add native libraries provided by the embedded system vendor. Furthermore, the resulting program will run fast, because an optimizing compiler designed especially for the embedded system will compile

http://www.linux-mag.com/2001-04/GCC_net_02.html
Modifying GCC to Support .NET
When we talk about GCC supporting .NET, there are several different things we could mean. We could mean that GCC would emit .NET IL (Intermediate Language) for any of the source languages that it can process. Then you could take your C++ program, compile it with G++ and obtain .NET IL. The resulting program could then be run on any system that supports .NET.
Another thing that we might mean is that GCC could process the .NET IL as input and emit machine code. In this case, you could take .NET IL generated by any .NET compiler and compile it to run on any system supported by GCC. For example, you might use Microsoft's C# compiler to generate .NET IL and then use GCC to transform the .NET IL into x86 code that could run under GNU/Linux.
The last thing we might mean is that GCC could accept the C# source language proposed by Microsoft. This language is specifically designed to target .NET. C# is Microsoft's language of choice for .NET even though .NET specifically supports multiple source languages. In this alternative, GCC could accept a C# program as input and, say, compile it to run under Solaris.
The first alternative (i.e., generating .NET IL) corresponds to writing a new "backend" for GCC. In many ways, .NET IL isn't very different from the assembly code used for any other processor, so generating .NET IL is analogous to porting GCC to the latest microprocessor. The other two alternatives correspond to

Java Assembly Language
There is currently no standard syntax for representing Java bytecodes in human-readable form. Sun's javap utility provides one representation, but its output is complicated by code offsets and constant pool index numbers. Thus, a simpler syntax of coding in Java bytecodes was created.
Note: This is documentation on the syntax only. To understand what the operands mean and their usage in programming, look at the Java Virtual Machine Specification. Most of the documentation is beta, meaning some of it is out-of-date. Look for the book by Tim Lindholm and Frank Yellin (to be available sometime this summer by Addison-Wesley) for the latest documentation. Excerpts are available at the above link.

http://gcc.gnu.org/ml/gcc-patches/2002-04/msg01702.html

This patch adds SSAPRE for Tree-SSA. I'll work more on it after finals
(May 10th is when they end).

The only real issues are that it doesn't
1. Insert reload statements that conform to SIMPLE (easy to fix, will do
after finals). Rather than transform
    e = a + b
into
    T.1 = a + b
    e = T.1

It does
    e = T.1 = a + b

2. Keep the SSA form valid (easy to fix, will do after finals).
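
To make the first issue concrete for myself, here is a rough Python sketch (not GCC code) of the missing step: splitting a chained assignment like `e = T.1 = a + b` into the one-assignment-per-statement form that the message says SIMPLE expects. The function name and the use of plain strings for statements are assumptions made just for this illustration.

```python
def split_chained_assignment(stmt):
    """Split 'e = T.1 = a + b' into ['T.1 = a + b', 'e = T.1'].

    Hypothetical helper: statements are plain strings, and every '='
    is assumed to be an assignment (no '==' or similar operators).
    """
    parts = [p.strip() for p in stmt.split("=")]
    *targets, expr = parts
    if not targets:
        raise ValueError("not an assignment: %r" % stmt)
    # The innermost target receives the expression itself.
    out = ["%s = %s" % (targets[-1], expr)]
    # Each outer target is then copied from the target to its right.
    for i in range(len(targets) - 2, -1, -1):
        out.append("%s = %s" % (targets[i], targets[i + 1]))
    return out
```

A statement that is already in the simple form passes through unchanged as a one-element list.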


I'll also add strength reduction after finals, as it's very easy to do as
well.

I'll comment it better (though the routines correspond directly with the
algorithms in http://citeseer.nj.nec.com/correct/399268) after finals as
well.

I'll also probably split exprref's out of varrefs after finals as well.


When C++/Java support for tree-ssa is added/finished, we'll want to add
code to prevent hoisting into exception regions.

http://gcc.gnu.org/ml/gcc-patches/2002-02/msg02063.html

[ast-optimizer-branch] Call graph for C (2/5)

* From: Sebastian Pop
* To: gcc-patches at gcc dot gnu dot org
* Cc: dnovillo at redhat dot com
* Date: Thu, 28 Feb 2002 20:07:27 +0000
* Subject: [ast-optimizer-branch] Call graph for C (2/5)

The following patch allows printing the call graph to an XML file
by passing the option -fdump-call-graph.

Seb.

http://axkit.org
http://www.hpl.hp.co.uk/people/bwm/rdf/jena/
http://rdf-filter.sourceforge.net/
http://injektilo.org/rdf/repat.html
http://sourceforge.net/projects/redfoot/
http://4suite.org/index.xhtml

So what if I write a program that can only legally work with free software,
and distribute it via a click-through license that asks for agreement
before the online web service delivers parsed, ready-to-execute data to
the client for secure execution, producing output data that can only
legally be read by free software?

Also, all transformations of the data are licensed in the same way, as with OpenCyc.

The data would be delivered in a secure package, signed and encrypted for the recipient.

Suppose we limit the data to such a restrained, restrictive, recursive End
User Click-through license agreement and Terms of usage.



It is RIGHT WING, RADICAL and REVOLUTIONARY.



It is *R*-GPL.



Because the license covers the output data as a declarative statement in a
new language, the number of covered transformations of that data could be huge.

What makes it recursive is that you keep outputting new transformations
of the existing data each time an interpret function is executed. This
becomes meme-like when it transfers itself via the internet and is
reinterpreted and modified by a host of new people.

Levels: RGPL(RGPL(RGPL(GPL))) = RGPL(3)

Recursion level 3 is an RGPLed program exporting meta data to a second
RGPLed program that outputs a different set of meta data, which is read
by a third instance; that instance is RGPL(3).



I think that a secure project on savannah would be the first step.


All source code that interfaces with the GPLed gcc across the net would
come from registered users of a web service. That would prevent a
proliferation of implementations, or of accessors to sensitive data.

This idea builds on top of dotgnu, gcc and emacs; all the gcc and gnu
projects could be web-service enabled.

That would allow persistent, secure, networked execution over a
secure gnu data network.

With access to the meta data of the CVS, mail and web pages, coupled with
a Google API hooked up to a DAML indexer, we could lay out the meta data
via VCG into DIA, then turn it into PDF and web pages via a Doxygen
modified to accept meta data from this data source as an XML data stream.



What do you think?


Wednesday, June 26, 2002


kenner@lab.ultra.nyu.edu (Richard Kenner) writes:
> In article Ronald Cole writes:
> >Oh, bullshit, Richard... The GPL plainly says "To protect your
> >rights, we need to make restrictions that forbid anyone to deny you
> >these rights or to ask you to surrender the rights."
> >
> >It seems clear that "discouraging free distribution" is equivalent
> >in effect to asking you to surrender the right to distribute.
>
> That's correct, though I'd use the word "waive" rather than
> "surrender".

Dewar posted that he feels that he is within the "letter *and the
spirit*" of the GPL when he *asks* his "wavefront" customers not to
redistribute that which he distributes. I, however, feel that by
doing so, he has violated the "spirit" of the GPL (since the quoted
clause is found in the preamble and doesn't appear to be present in
the enumerated sections).

> But the key point that this is *asking*, not *requiring*.

Still, the GPL says "To protect your rights, we need to make
restrictions that forbid anyone to ... *ask* you to surrender the
rights" and then fails to actually enumerate such a restriction (a
loophole which apparently both Stallman and Dewar use to discourage
"runaway snapshots").

Metaprogramming and
Free Availability of Sources
Two Challenges for Computing Today
François-René Rideau
francoisrene.rideau@cnet.francetelecom.fr
http://www.tunes.org/~fare

CNET DTL/ASR (France Telecom)
38-40 rue du Général Leclerc
92794 Issy Moulineaux Cedex 9, FRANCE
Abstract:

We introduce metaprogramming in a completely informal way, and sketch out a theory of it. We explain why it is a major stake for computing today, by considering the processes underlying software development. We show, from the same perspective, how metaprogramming is related to another challenge of computing, the free availability of the sources of software, and how these two phenomena naturally complement each other.

2.2 What is Metaprogramming? What is Reflection? Why are they so important?
Metaprogramming is the activity of manipulating programs that in turn manipulate programs. It is the most general technique whereby the programming activity can be automated, enhanced, and made to go where no programming has gone before.
Reflection is the ability of systems to know enough about themselves so as to dynamically metaprogram their own behavior, so as to adapt themselves to changing circumstances, so as to relieve programmers and administrators from so many tasks that currently need to be done manually.
These notions are explained in my article, Metaprogramming and Free Availability of Sources, that also explains why a reflective system must be Free Software. You may also consult the Reflection section of the TUNES Review subproject.
Reflection is important because it is the essential feature that allows for dynamic extension of system infrastructure. Without Reflection, you have to recompile a new system and reboot every time you have to change your infrastructure, you must manually convert data when you extend the infrastructure, you cannot mix and match programs developed using different infrastructures, you cannot communicate and collaborate with people with different backgrounds. At the technical level, all these mean interruption of service, unreliability of service, denial of service, and unawareness of progress; but at the psycho-social level, lack of reflection also means that people will have to

Re: gcc front-/backend (Was: Re: Binary archive issues)
On Fri, 8 Aug 1997, Steffen Opel wrote:

> > No. It was just one of the ideas during a "brainstorm", and like many
> > other ideas, it might die silently, particularly since an influential
> > individual whose name I won't tell and who can veto anybody else's
> > decissions, seems to be opposed to this idea.
> Could you please at least summarize the arguments of this well unknown
> person?

Sigh... I shouldn't have brought this subject to this list :-).

OK, the person was Richard M. Stallman, the president of FSF.

His arguments were along these lines:

1. GCC is not a school example, GCC is developed to be useful.

2. The proposal to split the backend from the frontends serves little
purpose in terms of improving quality, it mostly makes GCC look somewhat
"neater".

1 & 2 clash, and 1 is more important, therefore forget about 2 and focus
on releasing 2.8.0.

My comment:

PLEASE don't comment on this subject. The people who are making decisions
are NOT on this list, so discussing this subject serves no purpose apart
from occupying network resources.

SourceForge.net: Project Info - RedShift

An Illustration of Perl Objects with C APIs
To illustrate the use of C APIs with Perl objects, a toy PPM-format image class is created and then extended (ToyPPM.pm and ExtendedToyPPM.pm). This child class runs just as fast as if it were part of a big C-language extension library, instead of being small and independent, depending only on ToyPPM.pm having provided a C API, and on Inline::C, to make use of it.
The image data is maintained in the PPM string itself, as a sequence of (height x width x 3) one-byte integer color samples. These samples are exposed as a seemingly normal Perl array, implemented by a Tie::Array/perltie-based helper subclass, created from a simple specification, by a generator of array-like classes. This sample array is then "folded" into an apparently normal 3D array of array of arrays, to provide a very natural interface to the image data (though a quite slow one).
ToyPPM.pm also provides a C API, namely a method which returns a string of C code. It defines macros which obtain a pointer to the string's memory, and provide direct and rapid access to the image bytes. This method was created by the same generator, also from a simple specification.
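
The "folding" of the flat sample sequence into a 3D structure is easier to see in code. The following is only a Python sketch of the indexing idea, not the Perl module's implementation; the function names are made up, and it assumes the usual PPM raster layout (rows top to bottom, pixels left to right, three samples per pixel in R, G, B order).

```python
def sample_offset(row, col, channel, width):
    """Index into the flat sample sequence for pixel (row, col), channel 0..2."""
    return (row * width + col) * 3 + channel


def fold(samples, height, width):
    """Present height*width*3 flat samples as image[row][col][channel]."""
    if len(samples) != height * width * 3:
        raise ValueError("sample count does not match dimensions")
    image = []
    for r in range(height):
        row = []
        for c in range(width):
            start = sample_offset(r, c, 0, width)
            row.append(list(samples[start:start + 3]))
        image.append(row)
    return image
```

Unlike the tied Perl arrays described above, `fold` copies the samples into nested lists, so it shows the layout but not the live view onto the string.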

Berlin -- Home
Berlin is a windowing system derived from Fresco, a powerful structured graphics toolkit originally based on InterViews. Berlin extends Fresco to the status of a full windowing system, in command of the video hardware (via GGI, SDL, DirectFB or GLUT) and processing user input directly rather than peering with a host windowing system. Additionally, Berlin's extensions include a rich drawing interface with multiple backends, an upgrade to modern CORBA standards, a new Unicode-capable text system, dynamic module loading, and many communication abstractions for connecting other processes to the server. It is developed entirely by volunteers on the internet, using free software, and released under the GNU Library General Public License

Pietrzak.Org
update: 3/14/2001 I've found some projects working very nearly along the same lines as mine. The Software Development Foundation (SDS) is a system supporting analysis and manipulation of source code in multiple languages, based on an XML format for storage and inter-communication between SDS-compliant tools. Very much like my own design! Also, the Synopsis project describes itself as a code documentation tool with support for multiple languages; perhaps not as ambitious as SDS or my DCT, but more achievable... I may need to think about alliances with or support of these efforts. Along similar lines, the Introspector project, rather than create a new code analysis tool, seeks to augment the existing GCC compiler system to store the information needed for analysis tasks -- potentially leveraging the depth of GCC to create a system more powerful than any individual tool.
update: 7/1/2001 Well, it's been a while. I've had my brain stuck inside the guts of the C Macro preprocessor for some time now, but I've finally been able to get some things working. As such, I now have a web-page up that exercises some elements of the DCT. (Please be patient when viewing that page; it is running a hacked-up version of an experimental piece of software on an old x86 box with inadequate resources...)



API:
Take an object and:

Display Object in window
Display as HTML,XML,Graph,Table,Tree
Print to Text
Print to XML
Print as Source Code
Print Attribute List
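
A minimal sketch of what part of this API could look like, in Python and using only the standard library. Everything here is hypothetical (the function names, the `Node` class, the choice of output layout); it only covers the attribute list, text and XML items from the list above.

```python
import xml.etree.ElementTree as ET


def attribute_list(obj):
    """Return the object's public attributes as sorted (name, value) pairs."""
    return sorted((k, v) for k, v in vars(obj).items() if not k.startswith("_"))


def to_text(obj):
    """Print to Text: class name followed by one indented line per attribute."""
    lines = [type(obj).__name__]
    lines += ["  %s = %r" % (k, v) for k, v in attribute_list(obj)]
    return "\n".join(lines)


def to_xml(obj):
    """Print to XML: one element per attribute under a class-named root."""
    root = ET.Element(type(obj).__name__)
    for k, v in attribute_list(obj):
        ET.SubElement(root, k).text = str(v)
    return ET.tostring(root, encoding="unicode")


class Node:
    """Hypothetical example object."""

    def __init__(self, name, size):
        self.name = name
        self.size = size
```

For example, `to_xml(Node("tree_size", 4))` gives a `<Node>` element with `<name>` and `<size>` children. The graph, table and tree views would need a notion of links between objects, which this sketch leaves out.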

http://docbook.sourceforge.net/index.html

Brent Fulgham [mailto:brent.fulgham@xpsystems.com] wrote:

I've been playing with Ciao prolog a bit, and so I built a Debian
package for it (so that it would co-exist with the rest of the
system in a more pleasing manner.)

Interested parties can get it from:

http://people.debian.org/~bfulgham/ciao

There is a binary i386 build, and the necessary *diff.gz file
for building on other architectures.

Enjoy!


http://www.w3.org/XML/2001/05/xmlschema-test-collection/MStocAll.htm
XML Schema Test Results -- Microsoft contributions, full report :)

RMUTT
http://www.schneertz.com/rmutt/

Welcome to Schema Mania
Schema Mania is a place for people who like (or need, or are just good at) database designs. It's completely non-profit, dependent on the enthusiasm of its visitors and the talent of its contributors.

Purpose
Schema Mania was conceived as a repository of database designs. You'd be able to come here, browse for a database design in your "problem space". With luck, you'd find something at least similar to what you had in mind. You'd download it, adapt it to your needs, and be happy. www.schemamania.org would be a web of database designs, if you will.
But, a funny thing happened on the way to the forum. Much of the technology that Schema Mania needs is not ready for general use. What's available is nascent; the rest is missing. However valuable the concept might be, Schema Mania lacks both software and standards. It thus became part of Schema Mania's goal to bring together people of various disciplines, to help them find each other and create better tools.

RDF Interest Group IRC Scratchpad, last cranked at 2002-06-25 21:35
DAML mode for Xemacs and GNU emacs
posted by jhendler at 2002-06-25 21:35 ( +)

jhendler: installation instructions provided


Introspector Project collects Semantic Graphs from the GCC compiler via an XML interface and stores them in a Postgres Repository

Seth: Gcc bootstrap and Postgres interface underway
mdupont: The introspector project could produce RDF to feed into a DAML Index
mdupont: The Postgres database could be used for storing any old DAML
mdupont: Possible targeted compilers include GCC C++, Java, the DotGNU c# compiler and others including Perl, Python and Ruby
mdupont: The introspector patches GCC to dump the semantic network of nodes from a given input program
mdupont: It uses a simple XML graph and attribute syntax and pipes this information to a perl program that parses it on the fly.
mdupont: Currently it does not support RDF, but with the help of the motivated and helpful team at RDFIG we will be able to use DAML real soon now!!!
mdupont: A compiler that does not support advanced code introspection, as perl and ruby do, needs to be patched to support the Introspector interface.
mdupont: There is currently no DTD or Schema available.
mdupont: But an example output can be found at http://introspector.sourceforge.net/xml/_tree_size.xml.gz
mdupont: That represents the XML dump of a function called tree_size
mdupont: Which returns the size of a tree object, which is an atomic semantic node of the compiler
mdupont: The purpose of the introspector is to create an interface into the compiler to extract meta-data about a program
Seth: Kewl ... i want to to work for python
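
Since Seth asked about python: the "parse it on the fly" part of the pipeline might look roughly like this in Python. This is a sketch only. As noted above there is no DTD or schema, so the element and attribute names used here (`node`, `id`, `type`, `ref`, `name`, `to`) are invented for illustration and are not the real dump format.

```python
import xml.etree.ElementTree as ET


def collect_nodes(chunks):
    """Feed XML text chunks to a pull parser and record nodes as they complete.

    Element and attribute names are hypothetical; the real Introspector
    dump format may look quite different.
    """
    parser = ET.XMLPullParser(events=("end",))
    nodes = {}
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag == "node":
                refs = {r.get("name"): r.get("to") for r in elem.findall("ref")}
                nodes[elem.get("id")] = {"type": elem.get("type"), "refs": refs}
                elem.clear()  # drop the subtree once it has been recorded
    parser.close()
    return nodes
```

Reading from a pipe would then be `collect_nodes(sys.stdin)` after `import sys`, since iterating over stdin yields the text line by line and the pull parser does not care where the chunk boundaries fall.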

Relax NG Compact Syntax

Question about patents of the Intentional Programming tools from Microsoft, relates to the Introspector Project

mdupont:
Microsoft's Intentional Programming project was cancelled. But the idea was to store all the semantic information in a repository and use transformations to create the various representations of the program on the screen.

http://www.gnome.org/gnome-office/dia.shtml
posted by mdupont at 2002-06-25 20:10
mdupont: I would like to use DIA as UML editor for the introspector

W3C Semantic Web: Resource Description Framework (RDF) Interest Group Collaboration: IRC Tools and Logs
The RDF Interest Group IRC Scratchpad, a Web-based recommendation and annotation system, has been created by Edd Dumbill of XMLhack. The Scratchpad selectively logs comments made on the IRC channel, and is operated by an IRC bot 'dc_rdfig' (see chump instructions and source code release for more details). A full text search facility and RSS feed are available, as are the source XML files generated by the tool. Libby Miller has made an RDF-based search facility for the RSS blog data. ILRT's RDF at a Glance page shows one use of the RSS data.
Complete public logs of the discussions on the #rdfig channel are also available (in text, html and rdf flavours), thanks to Dave Beckett of ILRT. These logs are created by the 'logger' bot, which also offers search facilities (use: /msg logger help for more info).

W3C Semantic Web: Resource Description Framework (RDF) Interest Group #RDFIG - Internet Relay Chat (IRC) for Semantic Web Developers
Members of the RDF developer community can also augment mailing-list discussions with real time Internet chat tools. IRC is one such tool. There are a number of public IRC servers in existence: while W3C doesn't itself run such a server for general developer discussions, members of the Interest Group can often be found on the Open Projects IRC Network (channel #rdfig for general RDF IG discussion).

XML developer news from XMLhack: by and for the XML community Welcome to xmlhack, a news site for XML developers. Our aim is to distill essential news, opinions, tips and issues concerning XML development.

Resource Description Framework (RDF) / W3C Semantic Web Activity The Resource Description Framework (RDF) integrates a variety of applications from library catalogs and world-wide directories to syndication and aggregation of news, software, and content to personal collections of music, photos, and events using XML as an interchange syntax. The RDF specifications provide a lightweight ontology system to support the exchange of knowledge on the Web.

Active Server Pages: see ASP

Invocation:
The process of performing a method call on a CORBA object, which can be done without knowledge of the object's location on the network. Static invocation, which uses a client stub for the invocation and a server skeleton for the service being invoked, is used when the interface of the object is known at compile time. If the interface is not known at compile time, dynamic invocation must be used.

http://www.npac.syr.edu/projects/workingtutorials/shrideeparchive/documents/Glossary/comprehensive.htm
Introspection:
For those who are queasy about the idea of enforced naming conventions, explicit information about a class can be provided using the BeanInfo class. When a RAD Tool wants to find out about a JavaBean, it asks with the Introspector class by name, and if the matching BeanInfo is found the tool uses the names of the properties, events and methods defined inside that pre-packaged class.

FULLY BUZZWORD COMPLIANT


http://www.new-npac.org/users/fox/documents/rcihpccoct98/rcinpacpaperoct98.html




In this paper, we describe an approach to high performance computing which makes extensive use of commodity technologies. In particular, we exploit new Web technologies such as XML, CORBA and COM based distributed objects and Java. The use of commodity hardware (workstation and PC based MPP's) and operating systems (UNIX, Linux and Windows NT) is relatively well established. We propose extending this strategy to the programming and runtime environments supporting developers and users of both parallel computers and large scale distributed systems. We suggest that this will allow one to build systems that combine the functionality and attractive user environments of modern enterprise systems with delivery of high performance in those application components that need it. Critical to our strategy is the observation that HPCC applications are very complex but typically only require high performance in parts of the problem. These parts are dominant when measured in terms of compute cycles or data-points but often a modest part of the problem if measured in terms of lines of code or other measures of implementation effort. Thus rather than building such systems heroically from scratch, we suggest starting with a modest performance but user friendly system and then selectively enhancing performance when needed. In particular, we view the emergent generation of distributed object and component technologies as crucial for encapsulating performance critical software in the form of reusable plug-and play modules. We review here commodity approaches to distributed objects by four major stakeholders: Java by Sun Microsystems, CORBA by Object Management Group, COM by Microsoft and XML by the World-Wide Web Consortium.
Next, we formulate our suggested integration framework called Pragmatic Object Web in which we try to mix-and-match the best of Java, CORBA, COM and XML and to build a practical commodity based middleware and front-ends for today's high performance computing backends. Finally, we illustrate our approach on a few selected application domains such as WebHLA for Modeling and Simulation and Java Grande for Scientific and Engineering Computing.

2.4.3 Visual Metacomputing



The growing heterogeneous collection of components, developed by the Web / Commodity computing community, offers already now a powerful and continuously growing computational infrastructure of what we called DcciS - Distributed commodity computing and information System. However, due to the vast volume and multi-language multi-platform heterogeneity of such a repository, it is also becoming increasingly difficult to make the full use of the available power of this software. In our POW approach, we provide an efficient integration framework for several major software trends but the programmatic access at the POW middleware is still complex as it requires programming skills in several languages (C++, Java, XML) and distributed computing models (CORBA, RMI, DCOM). For the end users, integrators and rapid prototype developers, a more efficient approach can be offered via the visual programming techniques. Visual authoring frameworks such as Visual Basic for Windows GUI development, AVS/Khoros for scientific visualization, or UML based Rational Rose for Object Oriented Analysis and Design are successfully tested and enjoy growing popularity in the respective developer communities. Several visual authoring products appeared also recently on the Java developers market including Visual Studio, Visual Age for Java, JBuilder or J++.



HPC community has also explored visual programming in terms of custom prototypes such as HeNCE or CODE, or adaptation of commodity systems such as AVS. At NPAC, we are developing a Web based visual programming environment called WebFlow. Our current prototype summarized below and discussed in detail in Section 5.7 follows the 100% Java model and is currently being extended towards other POW components (CORBA, COM, WOM) as discussed in Sections 5.8 and 5.9.

http://clip.dia.fi.upm.es/Mail/ciao-users/0208.html

Richard,

Thanks for your tips, and thanks to all people on this list.
I am very excited about the resonance I get from the prolog community, the
gcc compiler community proper is not that interested in this project or any
project like it.
It seems that many people here have sympathy for the idea of extracting meta
data from c/c++ programs.

To answer your question,
>>Is there any redundant information?
yes there is much redundant information,
for example, I have one output file per input C file from the compiler,
plus one file per function that is compiled in each module, each time it is
declared in a C file (inlines appear all over the place).

>>Could the information be put into a CDB file and Ciao's memory be
>>used as a cache?
I was hoping that that would work.

The set of all global information for a c program is not that large,
types and functions, these should be compressed down.

The files that I have are around 10-20 MB per source file for the
translations of the gcc sources themselves.

My original memory limitations were with gnu prolog, I must admit I have not
tried with ciao yet :(.

>>Is there information which is seldom needed, so it could be loaded on
>>demand?
the bodies of the functions can be loaded on demand,
the usage information of data types is not always needed.

I have switched the processing to Perl for a while, but I really did like
working with prolog,
also because of the ability of querying.

This weekend I will send out an update on the project with all newer source
code and example XML files

to the project page at http://sourceforge.net/projects/introspector/

Mike
I will be working from my mdupont777@yahoo.com account this weekend.

-----Original Message-----
From: Richard A. O'Keefe [mailto:ok@atlas.otago.ac.nz]
Sent: Donnerstag, 13. Dezember 2001 17:18
To: ciao-users@clip.dia.fi.upm.es
Subject: Re: Database and memory limitations


Manuel Carro <boris@aaron.ls.fi.upm.es> wrote:
I find it of interest that you are transforming xml datasets into prolog
with xsl... specifically the reason your snippet caught my eye is I'm about
to try out some previous work with Topic Navigation Maps with Prolog (which
is new to me), well basically to see what fits well and what doesn't.

I note that SWI Prolog comes with an SGML parser which supports XML,
including XML namespaces. This package has particular support for RDF.
I don't know whether Ciao's and SWI's licences are compatible, but it
might be worth looking into. I'm told that SWI Prolog is being used
to process 90MB RDF files.

I also note that Prolog is vastly more convenient for XML processing
than XSLT is. Prolog "Document Value Model" data structures for
representing XML are pretty much bound to be much cheaper than the
"Document Object Model" data structures used by most XSLT processors,
if you have a reasonably compact representation for text. (SWI Prolog
uses garbage-collected atoms for this.)

My own experience is that having Prolog, Scheme, and Haskell available
it'll take a gun pointed at my head or an extremely large bribe to make
me use XSLT for anything.

I suspect that the fundamental problem is with the representation that
is being generated as the output of the XSLT processing step.

Is there any redundant information?
Is there information which is seldom needed, so it could be loaded on
demand?
Could the information be put into a CDB file and Ciao's memory be
used as a cache?

Tuesday, June 25, 2002

Hole in GNU GPL?
http://slashdot.org/article.pl?sid=00/01/17/172203&mode=nested&tid=117

Intentional programming.
http://www.research.microsoft.com/scripts/pubs/view.asp?TR_ID=MSR-TR-95-52
http://www.research.microsoft.com/scripts/pubs/view.asp?PubID=229

See :
http://gcc.gnu.org/ml/gcc/2000-11/msg00790.html
http://eidola.org/references.shtml

Sunday, June 23, 2002

See PerlMonks and look for posts from mdupont.

Mike

I have not updated my blog in a while.

There is a lot happening in the world of the introspector. I have been fighting licensing issues on many fronts.

Will make some more posts soon.

mike

Saturday, April 20, 2002

I have been hanging out at
irc.openprojects.net #dotgnu

Update 2:

1. Posted DOTGNU sql
DOTGNU
2. Started working on project directory structure

mike

UPDATE.

1. MySQL database server running on sourceforge.
2. VCG Code is being cleaned up and prepared for dissection.
3. Database model is simplified.
4. AST-OPTIMISER-Branch compiled and will be the next target of introspector.
5. First experiments with the GCC Deparse routines successful.

mike

Friday, April 05, 2002

Mailgate.ORG Web Server: comp.ai.shells

The Castor Project




Castor is an open source data binding framework for Java[tm]. It's basically the shortest path between Java objects, XML documents, SQL tables and LDAP directories. Castor provides Java to XML binding, Java to SQL/LDAP persistence, and then some more.

Thursday, March 28, 2002

I was thinking how to handle forward declarations of types.

There seems to be a macro DECL_EXTERNAL mentioned in the file :
http://gcc.gnu.org/ml/gcc-patches/2001-01/msg02251.html
/* Language-specific decl information. */
#define DECL_LANG_SPECIFIC(NODE) (DECL_CHECK (NODE)->decl.lang_specific)
In a VAR_DECL, nonzero means external reference:
do not allocate storage, and refer to a definition elsewhere.
In a FUNCTION_DECL, zero means that the function is declared auto
inside a function body.
These are externals.

Here is an example of the creation of a forward, it has to be looked into.
decl_specs = tree_cons (NULL_TREE, type, sc_spec); /* the declaration specifiers */
decl = start_decl (synth_id_with_class_suffix (name, implementation_context), decl_specs, 1, NULL_TREE, NULL_TREE);
DECL_CONTEXT (decl) = NULL_TREE;
finish_decl (decl, expr, NULL_TREE);
I think that a forward is just a declaration, but without too much information.

mike

Monday, March 25, 2002

I was thinking about using numbers to represent types. If a type is derived from another, then the subclass would be a smaller number. An instance of a type would store its data in the fraction of the type. Two types could be checked for compatibility by bit-masking them.


You could then create a type library for application that would be all in a range of numbers.



An array of a type would be a special type that would contain the size and the element type as smaller data parts of the number. The same for a pointer type.
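
One possible reading of the idea above, as a rough Python sketch. Everything here is an assumption made for illustration: the 32-bit width, the 4 bits per inheritance level, and the names `TypeCode`, `derive` and `is_compatible_with`. A derived type extends the bit prefix of its base, so all subtypes of a base fall inside one contiguous range of numbers, and compatibility becomes a single mask-and-compare. Array and pointer types are not covered.

```python
WIDTH = 32  # assumed size of a type number, in bits


class TypeCode:
    """A type number: a bit prefix stored in the high bits of a WIDTH-bit integer."""

    def __init__(self, value, prefix_len):
        self.value = value
        self.prefix_len = prefix_len

    @property
    def mask(self):
        # Ones over the prefix bits, zeros over the remaining low bits.
        return ((1 << self.prefix_len) - 1) << (WIDTH - self.prefix_len)

    def derive(self, index, bits=4):
        """Create a subtype by appending `bits` more prefix bits holding `index`."""
        if not 1 <= index < (1 << bits):
            raise ValueError("index 0 is reserved for the base type itself")
        new_len = self.prefix_len + bits
        if new_len > WIDTH:
            raise ValueError("inheritance chain too deep for this width")
        return TypeCode(self.value | (index << (WIDTH - new_len)), new_len)

    def is_compatible_with(self, base):
        # True when this type's number starts with the base type's prefix.
        return (self.value & base.mask) == base.value


root = TypeCode(0, 0)        # the universal base type
decl = root.derive(1)        # hypothetical "declaration" type
var_decl = decl.derive(1)    # derived from decl
expr = root.derive(2)        # unrelated sibling of decl
```

The low bits left over below the prefix are where an instance could keep its data, which is one way to read "the fraction of the type".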


Sunday, March 24, 2002

[Mono-list] RE: Compilers emitting parsetrees - GCC's internal tree representation is unwieldy, but it's also quite
powerful. See: http://www.ncsa.uiuc.edu/~wendling/tree.html for a great
introduction to the structures involved. The back-end technically runs
in the same process as the front-end, but it does simply walk this tree
and emit code.
If you're looking for a human-readable form of this tree, try:
http://sourceforge.net/projects/introspector/.

Saturday, March 23, 2002

I have been reading about C#, DOTNET and DOTGNU.
C# has some interesting language structures
1. Delegates as Function Pointers
2. Reflection of types
3. Indexers as collection operators.
4. Pointers via unsafe code.

WikiWeb looks kinda interesting to use as the basis for a web page.

I am thinking about what I can do with only a computer without all the development tools installed.
what can I do from this computer here?

Java Script? Web Pages?
What about a function to decorate the trees into webpages?

We select the data we need to output.
First we split the tree into windows.
Then put all the nodes in a window into a XML Document.
Then we traverse that document and create windows or decorations for the nodes.
Then we create HTML from the XML.
We push as much as possible onto style sheets and java script to display the data.

The original source code can be outputted, with the tree nodes interspersed between them.
A prototype can be built by hand of the possible output.
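
The steps above can be sketched in Python as a by-hand prototype. The node dictionaries, the window size and the function names (`split_into_windows`, `window_to_xml`, `xml_to_html`) are all made up for illustration; only the sequence of steps comes from the notes. Styling is left to a style sheet by emitting just class names.

```python
import xml.etree.ElementTree as ET
from html import escape


def split_into_windows(nodes, size):
    """Split the selected nodes into fixed-size windows."""
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


def window_to_xml(index, window):
    """Put all the nodes of one window into a small XML document."""
    root = ET.Element("window", {"id": str(index)})
    for node in window:
        el = ET.SubElement(root, "node", {"id": str(node["id"]), "kind": node["kind"]})
        el.text = node["text"]
    return root


def xml_to_html(window):
    """Traverse the XML and emit one decorated <span> per node."""
    spans = [
        '<span class="{}" title="node {}">{}</span>'.format(
            escape(el.get("kind")), escape(el.get("id")), escape(el.text or ""))
        for el in window.iter("node")
    ]
    return '<div class="window" id="w{}">\n{}\n</div>'.format(
        escape(window.get("id")), "\n".join(spans))


# Hypothetical tree nodes standing in for the selected compiler data.
nodes = [
    {"id": 1, "kind": "function_decl", "text": "tree_size"},
    {"id": 2, "kind": "parm_decl", "text": "t"},
    {"id": 3, "kind": "expr", "text": "a < b"},
]
windows = split_into_windows(nodes, 2)
pages = [xml_to_html(window_to_xml(i, w)) for i, w in enumerate(windows)]
```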

Tuesday, March 19, 2002

This is my first post from vacation, right now I am sitting in New Jersey at my grandma's house.

I saw an interesting operation in C# and was wondering how it works:
typeof (X).IsAssignableFrom

Turns out that it is also available in java, and mono implements it in reflection.h
Open Java supports it as well,
we will have to look into this interface.
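
For comparison, a rough Python analog of that check. This is not the C# or Java API, just the nearest built-in: `issubclass` with the arguments swapped. The class names are hypothetical.

```python
class Node:
    pass


class Decl(Node):
    pass


class VarDecl(Decl):
    pass


def is_assignable_from(target, source):
    """True when a value of type `source` could be stored where `target` is expected."""
    return issubclass(source, target)
```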

This reminds me of Andrew Koenig's article on unnamed types in the C++ journal.

Now, how can you do a switch on the type of the node?
I was thinking about an exception-based handler: how about a try block that gets the type and throws
the type as an exception; the catch would have the code, based on the object type, to handle whatever gets thrown.
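
A minimal sketch of that exception-based switch, in Python rather than C++ or C#. The node classes are hypothetical, and they derive from `Exception` only because Python requires raised objects to be exceptions. The handlers are tried in order, so the most specific types must come first.

```python
class Node(Exception):
    """Base tree node; deriving from Exception lets a node be raised."""

    def __init__(self, name):
        super().__init__(name)
        self.name = name


class Decl(Node):
    pass


class VarDecl(Decl):
    pass


class FunctionDecl(Decl):
    pass


class Expr(Node):
    pass


def describe(node):
    """Dispatch on the node type by throwing the node and catching it by class."""
    try:
        raise node
    except VarDecl as n:
        return "variable " + n.name
    except FunctionDecl as n:
        return "function " + n.name
    except Decl as n:
        return "other declaration " + n.name
    except Node as n:
        return "unhandled node " + n.name
```

Throwing is usually much slower than an ordinary type test, so this is more of a curiosity than something to put in a hot loop.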

mike

Thursday, March 14, 2002

Open Source Quality Project Cil: An Infrastructure for C Program Analysis and Transformation. Cil (C Intermediate Language) is a high-level representation along with a set of tools that permit easy analysis and source-to-source transformation of C programs.

WOW!

CIL is both lower-level than abstract-syntax trees, by clarifying ambiguous constructs and removing redundant ones, and also higher-level than typical intermediate languages designed for compilation, by maintaining types and a close relationship with the source program. The main advantage of CIL is that it compiles all valid C programs into a few core constructs with a very clean semantics. Also CIL has a syntax-directed type system that makes it easy to analyze and manipulate C programs. Furthermore, the CIL front-end is able to process not only ANSI-C programs but also those using Microsoft C or GNU C extensions. If you do not use CIL and want instead to use just a C parser and analyze programs expressed as abstract-syntax trees then your analysis will have to handle a lot of ugly corners of the language (let alone the fact that parsing C itself is not a trivial task). See Section 13 for some examples of such extreme programs that CIL simplifies for you.

In essence, CIL is a highly-structured, ``clean'' subset of C. CIL features a reduced number of syntactic and conceptual forms. For example, all looping constructs are reduced to a single form, all function bodies are given explicit return statements, syntactic sugar like "->" is eliminated and function arguments with array types become pointers.

Wednesday, March 13, 2002

DTD Inquisitor 2 DTD Inquisitor is a program originally designed for checking the quality of DTDs. It attempts to locate problematic parts in DTDs. A large number of real world DTDs were downloaded and passed through the DTD Inquisitor. We were particularly interested in the structure of the DTDs. The program has components of automata, graphs, some tree traversal and dynamic programming algorithms. These components are commonly used in applications manipulating the DTDs. DTD Inquisitor 2 is a reorganization of the previous distribution aiming to make those components more reusable.

Monday, March 11, 2002

Way COOL!
C++ PATCH: new abi rtti support
(class_type_node, get_identifier ("type_info"), 1);
if (flag_honor_std)
pop_namespace ();
! if (!new_abi_rtti_p ())
! {
! tinfo_decl_id = get_identifier ("__tf");
! tinfo_decl_type = build_function_type
(build_reference_type
(build_qualified_type
(type_info_type_node, TYPE_QUAL_CONST)),
void_list_node);

Monikers in the Bonobo Component System. We recently reimplemented and fully revamped the Moniker support in Bonobo. This work has opened a wide range of possibilities: from unifying the object naming space, to provide better integration in the system. Note: on this document I have omitted exception environments handling for the sake of explaining the technology.

Bonobo Components: Architecture and Application Bonobo is the component technology that is part of the GNOME desktop environment. This paper discusses the architecture of Bonobo, and the way it can be used to write software. Also, it takes a look at the current state of Bonobo, and some of the future developments.

[Mono-list] GCC front-end?
> On the roadmap, it seems to imply that y'all are working on a
> completely new "JIT" compiler for the CLI. I'm curious about why this
> is a better plan than writing a Gcc front-end for CLI bytecode.

I've been seriously thinking about doing that for quite a while now; since
long before Mono was announced. Technically I think it is a very good idea.

The main drawback, from my perspective, is that doing so would increase
the acceptance of .NET, which would be good for Microsoft, which would
increase world-wide income inequity, and concentrate too much power in
the hands of too few, and that would be bad.

I remember when I first heard of C++; it was from Bell Labs, same place
as C, and it had some nice technical improvements to C. But it was a
new language, and it didn't have the same degree of vendor support that
C had. Was it really worth writing code in this new language?
I first became convinced that C++ would become popular when I heard
that there was a GNU C++ implementation. The existence of a free
software implementation can be quite important to people's decisions
as to whether or not to adopt a particular piece of technology.

Question 4: Should someone work on a GCC front-end to C#?
I would love if someone does, and we would love to help anyone that takes on that task, but we do not have the time or expertise to build a C# compiler with the GCC engine. I find it a lot more fun personally to work on C# on a C# compiler, which has an intrinsic beauty.
We can provide help and assistance to anyone who would like to work on this task.
Question 5: Should someone make a GCC backend that will generate CIL images?
I would love to see a backend to GCC that generates CIL images. It would provide a ton of free compilers that would generate CIL code. This is something that people would want to look into anyways for Windows interoperation in the future.
Again, we would love to provide help and assistance to anyone interested in working in such a project.
Question 6: What about making a front-end to GCC that takes CIL images and generates native code?
I would love to see this, specially since GCC supports this same feature for Java Byte Codes. You could use the metadata library from Mono to read the byte codes (ie, this would be your "front-end") and generate the trees that get passed to the optimizer.
Ideally our implementation of the CLI will be available as a shared library that could be linked with your application as its runtime support.
Again, we would love to provide help and assistance to anyone interested in working

Daniel Berlin - Re: libtool (was Re: [patch] releases.html)
"Zack Weinberg" writes:

> Yes, libbackend.so might be nice just to reduce disk consumption.
> (Didn't HJ have patches for this a long, long, time ago?) Wouldn't
> it cause arguments about opening loopholes for non-free front ends,
> though?

Can we get over this yet?

SGI already did it with the backend. They go directly from trees to
WHIRL, using gcc/g++ as a front end.

If someone wants to do it with the frontend, we aren't making it that
much easier by using a shared library.

Would any respectable company really want to try to make a compiler
when the legal status of doing a non-free frontend is pretty shaky?

And are we going to be able to really stop unrespectable companies
from doing it anyway?

Why should we not be able to further the usefulness and usability of a
free software project because someone, somewhere, might find it
slightly easier to integrate into something that may make it illegal,
or not illegal, for them to redistribute it.

--Dan

Re: Converting the gcc backend to a library?
To: gcc at gcc dot gnu dot org
Subject: Re: Converting the gcc backend to a library?
From: Richard Stallman
Date: Mon, 17 Jan 2000 19:51:25 -0700 (MST)
Reply-to: rms at gnu dot org



Companies often try to make software non-free, and some would write
non-free add-ons to GCC if we let them. The reason we have free C++
and Objective C support is because the companies which wrote these
front ends had no *feasible* way to use them without making them part
of GCC, where the GPL required them to be free. It is vital that we
preserve this situation.

Anything that makes it easier to use GCC back ends without GCC front
ends--or simply brings GCC a big step closer to a form that would make
such usage easy--would endanger our leverage for causing new front
ends to be free.

Because of this, the GNU Project will in all probability not install
such changes, should they be available. This statement reflects a
firm conclusion based on a decade of thought.

I ask anyone who would like to make such changes in GCC to please
contact me privately. I would like to talk with you about the ideas
you are interested in working on, to look at the magnitude of their
potential benefits, and consider other possible ways of achieving
them. Please think about the importance of future free front ends,

3.04 Why don't you use the LGPL for libraries?

Using GPL plus linking exception has several advantages. One is
that this makes it more convenient to reuse parts of the code
(possibly with modification) in GPL-licensed files.

Also, you can exclude native methods from the linking exception.
This is done in the license on the C# library, "pnetlib", which
is distributed under these terms:

The source code for the library is distributed under the
terms of the GNU General Public License, with the following
exception: if you link this library against your own
program, then you do not need to release the source code
for that program. However, any changes that you make to the
library itself, or to any native methods upon which the
library relies, must be re-distributed in accordance with
the terms of the GPL.

We call this the "GPL plus linking exception", which is also
used by the GNU Classpath project.

We aren't trying to restrict the use of the library by any kind of
commercial entities. However, a proprietary software company could
produce their own proprietary runtime engine that has
"enhanced" native method support of some kind. Under the terms
of the LGPL, they would be obligated to release the
declaration of the native method in

The DotGNU FAQ Licensing issues
~~~~~~~~~~~~~~~~

3.00 What software licenses does DotGNU use?

All official software development projects of the DotGNU
meta-project use the GNU General Public License (GNU GPL).
For Libraries which are intended to be linked with third-party
programs that may not have a GPL-compatible license, as a special
exception such linking is allowed.


3.01 Does the linking exception carry over to derivative works?

If you create a derivative work of pnetlib or any library which is
licensed as "GPL plus linking exception", then it is up to you
whether you want the linking exception to carry over to your derivative
work. If you leave the exception in the text, then it applies to
your version.


3.02 What about programs which access each other through network
protocols. Is that a form of linking?

No. A GPL'd program can use any kind of webservice regardless
of how the webservice software is licensed, and GPL'd webservice
software can be used by any program regardless of that program's
license.

Monday, March 04, 2002

Python Docstring Processing System The purpose of the Python Docstring Processing System project is to create a standard, modular tool for extracting inline documentation from Python modules and converting it into useful formats, such as HTML, XML, and TeX.

Abstract Syntax Tree Optimizations - GNU Project - Free Software Foundation (FSF) Abstract Syntax Tree Optimizations
This page describes ongoing work to improve GCC's tree based optimizations. There is a branch in CVS called ast-optimizer-branch, which is available to experiment with these optimizers. As these stabilize, they can be submitted for review onto the mainline development tree. Please contact Nathan Sidwell if you want to contribute.
Background & Rationale
GCC, in common with many other compilers, has more than one internal representation of a program. The main ones are trees and RTL. The trees, (or formally abstract syntax trees - ASTs) are generated during parsing, and are close to the source language semantics. The RTL is generated in the back end, and is close to the generated assembly code. Ideally, the AST would contain all the semantic information of the source program.
Historically, GCC generated RTL one statement at a time, so the AST did not stay around very long. This has changed with 'function at a time' compilation (Inliner), which both C and C++ frontends now implement. With the AST for complete functions, and the additional semantic information they contain, the opportunity for new optimizations presents itself.

The SUIF Compiler - SUIF 2 The SUIF 2 compiler infrastructure project is co-funded by DARPA and NSF. It is a new version of the SUIF compiler system, a free infrastructure designed to support collaborative research in optimizing and parallelizing compilers. It is currently in the beta test stage of development.

Encyclopedia.com - Results for Chomsky, Noam According to transformational grammar, every intelligible sentence conforms not only to grammatical rules peculiar to its particular language, but also to "deep structures," a universal grammar underlying all languages and corresponding to an innate capacity of the human brain. Chomsky and other linguists who have built on his work have formulated transformational rules, which transform a sentence with a given grammatical structure (e.g., "John saw Mary") into a sentence with a different grammatical structure but the same essential meaning ("Mary was seen by John"). Transformational linguistics has been influential in psycholinguistics, particularly in the study of language acquisition by children.

20th WCP: Chomsky and Knowledge of Language ABSTRACT: The linguistic theory of Chomsky has changed the long, traditional way of studying language. The nature of knowledge, which is closely tied to human knowledge in general, makes it a logical step for Chomsky to generalize his theory to the study of the relation between language and the world, in particular the study of truth and reference. But his theory has been controversial and his proposal of "innate ideas" has been resisted by some empiricists who characterize him as rationalist. In our view, these empiricists make a mistake. In the present paper we attend to his position regarding linguistics as a science of mind/brain, which we believe is an important aspect of his theory that has not been paid enough attention or understood by his opponents. In turn, this will help to clarify some of the confusions around his theory. Finally we will discuss some of the debatable issues based on the outlines we draw.

xrefer - Model In Chomsky's classic transformational model of grammar (Aspects of the Theory of Syntax, 1965), a few syntactic rules in the base of the grammar provided a syntactic deep structure which was then elaborated by processes known as transformations in order to produce a surface structure. Semantics or meaning was dependent on the deep structure, while phonology or sound was dependent on the surface structure. More recently, in Knowledge of Language (1986), Chomsky has envisaged a model of grammar which is less obviously directional, and which, as with a computer program, is composed of a series of modules, each of which is fairly simple in its general workings, but which becomes complex as it interacts with other modules.

Friday, March 01, 2002

Frequently Asked Questions about the GNU GPL - GNU Project - Free Software Foundation (FSF) Is there some way that I can GPL the output people get from use of my program? For example, if my program is used to develop hardware designs, can I require that these designs must be free?
In general this is legally impossible; copyright law does not give you any say in the use of the output people make from their data using your program. If the user uses your program to enter or convert his own data, the copyright on the output belongs to him, not you. More generally, when a program translates its input into some other form, the copyright status of the output inherits that of the input it was generated from.
So the only way you have a say in the use of the output is if substantial parts of the output are copied (more or less) from text in your program. For instance, part of the output of Bison (see above) would be covered by the GNU GPL, if we had not made an exception in this specific case.
You could artificially make a program copy certain text into its output even if there is no technical reason to do so. But if that copied text serves no practical purpose, the user could simply delete that text from the output and use only the rest. Then he would not have to obey the conditions on redistribution of the copied text.

Thanks to Tilly
http://www.gnu.org/licenses/gpl-faq.html#IfInterpreterIsGPL If a programming language interpreter is released under the GPL, does that mean programs written to be interpreted by it must be under GPL-compatible licenses?
When the interpreter just interprets a language, the answer is no. The interpreted program, to the interpreter, is just data; a free software license like the GPL, based on copyright law, cannot limit what data you use the interpreter on. You can run it on any data (interpreted program), any way you like, and there are no requirements about licensing that data to anyone.
However, when the interpreter is extended to provide "bindings" to other facilities (often, but not necessarily, libraries), the interpreted program is effectively linked to the facilities it uses through these bindings. So if these facilities are released under the GPL, the interpreted program that uses them must be released in a GPL-compatible way. The JNI or Java Native Interface is an example of such a facility; libraries that are accessed in this way are linked dynamically with the Java programs that call them.
Another similar and very common case is to provide libraries with the interpreter which are themselves interpreted. For instance, Perl comes with many Perl modules, and a Java implementation comes with many Java classes. These libraries and the programs that call them are always dynamically linked together.
A consequence is that if you choose to use GPL'd Perl modules or Java classes in your program, you must relea

XIG translates graph schemas defined in the form of UML class diagrams into the internal GXL graph schema format supported by GXL. GCF, on the other hand, is a framework which simplifies the development of GXL export and import tools, i.e. the development of translators from other formats to GXL and vice versa.

GXL (Graph eXchange Language) itself is designed to be a standard exchange format for graphs. GXL is an XML sublanguage and the syntax is given by an XML DTD (Document Type Definition). This exchange format offers an adaptable and flexible means to support interoperability between graph-based tools (cf. http://www.gupro.de/GXL/).
In particular, GXL was developed to enable interoperability between software reengineering tools and components, such as code extractors (parsers), analyzers and visualizers. GXL allows software reengineers to combine single-purpose tools especially for parsing, source code extraction, architecture recovery, data flow analysis, pointer analysis, program slicing, query techniques, source code visualization, object recovery, restructuring, refactoring, remodularization etc. into a single powerful reengineering workbench.
There are two innovative features in GXL that make it well-suited to an exchange format for software data. One, the conceptual data model is a typed, attributed, directed graph. This is not to say that all software data ought to be manipulated as graphs, but rather that they can be exchanged as graphs. Two, it can be used to rep
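
To make the "typed, attributed, directed graph" idea concrete, here is a minimal sketch in Python that writes a tiny call graph in GXL-style XML. The element names (gxl, graph, node, edge, attr, type) follow the GXL DTD as I understand it; the graph id "callgraph", the schema file "schema.gxl" and the type names are my own assumptions for illustration, not part of any real schema.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"


def build_gxl(nodes, edges):
    """Return a GXL-style XML string for a typed, attributed, directed graph.

    nodes: dict of node id -> (type name, {attribute name: string value})
    edges: list of (source id, target id, type name)
    """
    # Declare the xlink prefix by hand so the xlink:href attributes resolve.
    root = ET.Element("gxl", {"xmlns:xlink": XLINK_NS})
    graph = ET.SubElement(root, "graph", id="callgraph", edgemode="directed")
    for node_id, (type_name, attrs) in nodes.items():
        node = ET.SubElement(graph, "node", id=node_id)
        # The type reference points into a hypothetical schema document.
        ET.SubElement(node, "type", {"xlink:href": "schema.gxl#" + type_name})
        for name, value in attrs.items():
            attr = ET.SubElement(node, "attr", name=name)
            ET.SubElement(attr, "string").text = value
    for source, target, type_name in edges:
        edge = ET.SubElement(graph, "edge", {"from": source, "to": target})
        ET.SubElement(edge, "type", {"xlink:href": "schema.gxl#" + type_name})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_gxl(
        {"main": ("Function", {"file": "main.c"}),
         "helper": ("Function", {"file": "util.c"})},
        [("main", "helper", "Calls")],
    ))
```

The point of the sketch is only that a parser, an analyzer and a visualizer could each read or write this one file format without knowing anything about each other.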

public.kitware.com: Welcome The purpose of the GCC-XML extension is to generate an XML description of a C++ program from GCC's internal representation. Since XML is easy to parse, other development tools will be able to work with C++ programs without the burden of a complicated C++ parser.

Many software professionals recognize the value of hybrid compiled/interpreted software environments. Compiled languages such as C and C++ provide efficiency, speed, and flexible data structure representation. Interpreted languages such as Tcl or Python remove the compile/link cycle from the development process, are indispensable for prototyping, and provide a multitude of packages for creating GUIs, numerics, and web access.
The creation of such hybrid environments is difficult. Typically C/C++ developers will manually add the interpreted language binding, or use semi-automatic tools such as SWIG. In some cases, custom wrapper generators have been created that are specialized to a particular system, such as the Visualization Toolkit (VTK). However, no fully automatic, general wrapping system has been available. As a result, mixed language systems are less common than might be expected, requiring excessive resources to develop.
Kitware, Inc. has developed an open-source wrapping package called CABLE (CABLE Automates Bindings for Language Extension). It is a tool designed to automatically generate bindings to C++ classes for use in interpreted languages. This system works in conjunction with GCC-XML, an extension to the GCC compiler. GCC-XML, also developed by Brad King at Kitware, is used to parse arbitrarily complex C++ code and then produce an XML representation. This representation is then processed by CABLE to generate wrappers. See the page on Running CABLE for more details.
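
As a rough illustration of why an XML description is easier for tools than a full parser, here is a small Python sketch that reads a hand-written fragment in the style of GCC-XML output and rebuilds the function signatures from it. The fragment is my own simplified assumption of what such output looks like (real output has many more elements and attributes), and this is not how CABLE itself works.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the style of GCC-XML output (simplified assumption).
GCCXML_SAMPLE = """
<GCC_XML>
  <FundamentalType id="_1" name="int"/>
  <FundamentalType id="_2" name="double"/>
  <Function id="_3" name="scale" returns="_2">
    <Argument name="value" type="_2"/>
    <Argument name="factor" type="_1"/>
  </Function>
</GCC_XML>
"""


def function_signatures(xml_text):
    """Return one readable signature string per Function element."""
    root = ET.fromstring(xml_text)
    # Types are referenced by id, so build an id -> name lookup table first.
    type_names = {
        el.get("id"): el.get("name")
        for el in root
        if el.tag == "FundamentalType"
    }
    signatures = []
    for fn in root.iter("Function"):
        args = ", ".join(
            type_names[arg.get("type")] + " " + arg.get("name")
            for arg in fn.findall("Argument")
        )
        signatures.append(
            type_names[fn.get("returns")] + " " + fn.get("name") + "(" + args + ")"
        )
    return signatures


if __name__ == "__main__":
    for signature in function_signatures(GCCXML_SAMPLE):
        print(signature)  # double scale(double value, int factor)
```

A wrapper generator would start from the same kind of lookup and emit binding code instead of a string.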

Thursday, February 28, 2002

Genoa--GEN++ GEN++ is an application-generator that greatly simplifies the task of creating analysis tools for the C++ language. Analysis tools are specified in a high-level domain-specific language (DSL) that is designed to facilitate the task of specifying C++ analysis tools.
GEN++ is based on Cfront, the original C++ to C language translator developed by Bjarne Stroustrup, and GENOA, a language-independent parse-tree querying framework. It was originally developed at AT&T Bell Laboratories (now Lucent Technologies/Bell Laboratories) by Prem Devanbu (now at University of California, Davis) and Laura Eaves. Thanks to Lucent Technologies/Bell Labs innovations, it is now available for free download.
(Aria is a research prototype created by Prem Devanbu, David Rosenblum and Alex Wolf; the GEN++ tool has similar features.)
Several tools have been created with GEN++, and come with the package; these can both be used directly, and as a springboard for other applications:

Purpose The purpose of this paper is to take a publicly available application which is well-used, develop a conceptual architecture based on its documentation and source code, and finally to produce the concrete architecture using CIA and identify any discrepancies

PBS: Portable Bookshelf PBS: The Portable Bookshelf - Introduction
The Software Bookshelf is a web-based paradigm for the presentation and navigation of information representing large software systems. The Portable Bookshelf (PBS) is one implementation of this concept. The PBS Toolkit is our set of tools for the generation of a PBS Bookshelf.

CIAO CIAO is a graph-based navigator that allows users to query and browse structural connections embedded in different software and document repositories. The architecture of CIAO enables the construction of successively more complex operators using a notion of virtual database pipelines. Currently, CIAO runs on SGI, Sun4, HP, and Solaris machines.

Project Home Page ArgoUML: A modelling tool for design using UML.

extende > faq eXtenDE initially started out life as an extensible development environment.
The first step is to decide how to best represent Java source code using XML. There are currently two schools of thought on this matter, JavaML and Eric Armstrong's (a development member of eXtenDE) proposal. Both were developed separately, so they initially had quite different goals. Please see the Mailing List Archive for this lengthy discussion, as it is currently unresolved.
The next step, once this decision has been made, is to write an editor, debugger, repository, etc. that can be plugged into a variety of existing IDE systems (Netbeans, jBuilder, jEdit, VisualAge, etc.).

JavaML Home Page (Greg J. Badros) The classical plain-text representation of source code is convenient for programmers but requires parsing to uncover the deep structure of the program. While sophisticated software tools parse source code to gain access to the program's structure, many lightweight programming aids such as grep rely instead on only the lexical structure of source code. I describe a new XML application that provides an alternative representation of Java source code. This XML-based representation, called JavaML, is more natural for tools and permits easy specification of numerous software-engineering analyses by leveraging the abundance of XML tools and techniques. A robust converter built with the Jikes Java compiler framework translates from the classical Java source code representation to JavaML, and an XSLT stylesheet converts from JavaML back into the classical textual form.
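
Here is a small Python sketch of the kind of lightweight analysis an XML representation of source code makes easy: listing which messages each method sends. The fragment below is a simplified assumption written in the spirit of JavaML; the element names (java-source-program, class, method, send) are how I remember them and the real DTD carries far more detail.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written fragment in the spirit of JavaML (an assumption).
JAVAML_SAMPLE = """
<java-source-program>
  <class name="Greeter">
    <method name="greet">
      <send message="println"/>
      <send message="format"/>
    </method>
    <method name="reset"/>
  </class>
</java-source-program>
"""


def calls_by_method(xml_text):
    """Map 'Class.method' to the list of message names sent inside it."""
    root = ET.fromstring(xml_text)
    result = {}
    for cls in root.iter("class"):
        for method in cls.iter("method"):
            key = cls.get("name") + "." + method.get("name")
            # iter() walks the whole subtree, so nested sends are included.
            result[key] = [s.get("message") for s in method.iter("send")]
    return result


if __name__ == "__main__":
    for name, sends in calls_by_method(JAVAML_SAMPLE).items():
        print(name, sends)
```

With grep you would have to guess at method boundaries from the text; here the structure is already in the document tree.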

IST - GUPRO - Generische Umgebung zum PROgrammverstehen (Generic Environment for Program Understanding) GUPRO is a joint project with Aachener und Münchener Informatik-Service GmbH, Hamburg, and the IBM Scientific Center, Heidelberg. It is funded by the Federal Ministry of Education, Science, Research and Technology (BMBF), funding program Software Technologies, grant number 01 IS 504.

I have published this to the gcc mailing list
James Michael DuPont - Linkage of GPLed GCC to Closed Source via XML or Perl Dear GCC Developers,

For the past three years, I have been working on a
project to create an object-oriented interface to the
GCC compiler, the GCC Node Introspector
(http://introspector.sourceforge.net/).

This turned from a C project into a Perl project after I
realised the power of Perl for handling strings and
complex data structures.
Currently I am using a modified version of c-dump.c,
as is done in CPPX
(http://swag.uwaterloo.ca/~cppx/doc/cppx/arch.html). I
output the tree nodes in an XML form that is very
similar to the tree dump, just with XML syntax. This
is streamed into a Perl program via popen and written
to a Postgres database.

Here is an article that I wrote for