Tag Archives: data mining

Data mining the Asita-Subhadra connection: Progress Report 1

Out of 126 clusters, I have gone through the first 20 and found that 7 are chiasmically meaningful, a much higher proportion (35%) than I expected.

Even more surprisingly, the clusters turned up what I had previously gleaned by reading and manually comparing those parts.

Here’s a preview of what I have gotten thus far.

ProgressReport1 - clusters

I understand that might not be all that intelligible, but here’s what I gleaned from those clusters.

ProgressReport1 - gleanings

In terms of providing different perspectives on what is chiasmic, I think this method is holding up very well so far, and it has generated quite a few matches that I did not pick up on while reading these two sections. (What I did pick up on is briefly presented here.)

The most interesting would probably be the one found in cluster 4 – the Buddha was born to end birth. How strange that seems! Have you ever heard of someone doing something in order to stop doing that very same action?

A discussion with Eric brought to my mind how eating can be considered an example of this. We eat to stave off hunger so that we do not have to eat further, but, as Eric quickly pointed out, that is only a temporary relief. Perhaps that’s the importance of the Buddha’s endeavour: it is said to be the ultimate relief.

Attack plan for data mining the Buddhacarita

This is not meant to be comprehensive, but I need to do sufficient research to write a paper. I am also adhering to a principle I recently learned: I should not stop a project partway through what I set out to complete, for that would do me no good in understanding the pros and cons of using this method.

So the simple attack plan is this:

1. Use the current clusters generated from the analysis of Asita’s and Subhadra’s appearance in the Buddhacarita.

2. Continue to dig into all the clusters, which currently number 126.

3. Analyze them manually to see if the clustering reveals any interesting ways to understand the structure of the Buddhacarita chiasmically.

4. Report in, both here and to my professors who are or might be interested in this work.

State-of-my-data-mining-exploits report

I haven’t devoted much time to trying to data-mine my way into some new understanding of the Buddhacarita, but I have to present my work to date to my class tomorrow, so here’s a brief review.

1. My data comes from the Chinese version of the Buddhacarita found in the digital version of the Taisho Shinshu Daizokyo as completed by CBETA. This text is called 佛所行讚.

2. Arthur Chen, a Master’s candidate at the Department of Religious Studies of Hsi Lai University, did me a great favor by converting the Chinese text into a tab-delimited file that indicates the location and occurrence of each character in the entire 60,000-character-long text. There are 1997 unique characters.

3. In preparation for a clustering analysis of the file, I have removed about 100 unique characters because they are not very meaningful in and of themselves. These include prepositions, conjunctions and the like.

4. Clustering analysis is used because supervised learning is not possible here (the characters cannot be classified into known categories). In this case, I am trying to explore whether there are some unknown, and unexplored, connections between the characters that my current manual reading and interpretation of the text have missed.

5. There are two possible directions to take here. One is to input an attribute indicating the location (in which chapter, for example) of each character and use clustering to tell me which locations (chapters) are more related to one another. The other is to use my current understanding of the structure of the text to select material to be clustered, thereby zooming into particular sections and using the results from the analysis to inform my manual reading.
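To make the first direction a little more concrete, here is a minimal sketch in Python. It is not the analysis I actually ran. It assumes a three-column layout (chapter, position, character) for the tab-delimited file, uses a tiny made-up sample and a two-character stop list in place of the roughly 100 characters removed, and swaps in cosine similarity with a simple single-linkage threshold as a stand-in for whatever clustering method is eventually used. The function names are hypothetical.

```python
import csv
import io
import math
from collections import Counter, defaultdict

# Made-up sample in an ASSUMED layout: chapter <TAB> position <TAB> character.
SAMPLE_TSV = (
    "1\t1\t佛\n1\t2\t生\n1\t3\t於\n"
    "2\t1\t王\n2\t2\t宮\n2\t3\t於\n"
    "3\t1\t佛\n3\t2\t生\n3\t3\t之\n"
)

# Assumed stand-in for the ~100 function characters removed before clustering.
STOP_CHARACTERS = {"於", "之"}


def chapter_profiles(tsv_text, stop_characters):
    """Count how often each non-stop character occurs in each chapter."""
    profiles = defaultdict(Counter)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for chapter, _position, character in reader:
        if character not in stop_characters:
            profiles[chapter][character] += 1
    return dict(profiles)


def cosine_similarity(a, b):
    """Cosine similarity between two character-count profiles."""
    dot = sum(count * b[char] for char, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_chapters(profiles, threshold):
    """Single-linkage grouping: a chapter joins (and merges) every group
    that contains at least one chapter similar enough to it."""
    groups = []
    for chapter in sorted(profiles):
        matches = [
            g for g in groups
            if any(cosine_similarity(profiles[chapter], profiles[m]) >= threshold
                   for m in g)
        ]
        merged = [chapter]
        for g in matches:
            merged.extend(g)
            groups.remove(g)
        groups.append(sorted(merged))
    return sorted(groups)


profiles = chapter_profiles(SAMPLE_TSV, STOP_CHARACTERS)
print(group_chapters(profiles, threshold=0.5))
```

On this toy sample, chapters 1 and 3 share their remaining characters and end up grouped together, while chapter 2 stands alone. The second direction would use the same steps, but on rows selected from particular sections (the Asita and Subhadra passages, say) rather than on whole chapters.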

Break from Buddhacarita and my thesis

After furiously posting some of my preliminary analysis of the Buddhacarita and the theoretical framework I am using for my thesis, I now have to take a break of about two weeks to write my midterm paper and prepare for a presentation.

Just something before I go off.

Conversations with my girlfriend on her thesis — Context matters

My girlfriend, Ah Wan, is working on her B.A. thesis, and she’s hoping to introduce an as-yet-unheard-of Chan master and what little seems to remain of his teachings. She was progressing well, but seems to have met with some inertia of late, and she spoke of how she wasn’t experiencing the kind of joy I seem to derive from my research on the Buddhacarita.

I couldn’t help much, but today a professor facilitating her thesis-writing class did. Upon reflection, she’s decided to go read up on the context this particular master lived in, and about his teacher, who is reportedly more famous.

Sometimes what stops us is surprising. Ah Wan reported that she didn’t want to do any more on the topic, not because she did not care about it (she sincerely hopes to introduce this master’s thoughts), but because the thought of having to go in a whole new direction put her off.

Yet at the very same time, if we learn to take bite-size bits of our big projects, we’ll quickly find that the latter are the very motivations for our progress.

My midterm paper — The way to enlightenment according to Sarvastivadin Abhidharma sources

Not much to say, but I think this should be the part I am most interested in of all the Abhidharmic analyses. I’m thinking of doing some tables to better understand Prof. Dhammajoti’s thoughts on this, as presented in the Sarvastivada Abhidharma. Please tell me if you might be interested.

Midterm presentation — Data Mining!! An exciting (possible) new approach to understanding Buddhist texts

I really should start a new blog to document my forays into this field, but basically I’m hoping to use some data mining techniques (which involve a combination of AI and statistics) to explore Buddhist texts. More on this later, once I come up with the presentation.

For some inkling of what data mining might be about (interesting read even if you don’t understand anything about data mining from it anyway), read http://bits.blogs.nytimes.com/2013/10/28/spotting-romantic-relationships-on-facebook/?_r=0