• DB&B

Database And Brain

Daily Routine Makes A Difference

Get Hadoop up and running without DNS

By MD on Sep 14, 2008 | In Database

Over the past couple of days, I have been trying to get the Hadoop Word Count example running on my local cluster of three CentOS boxes.

Thanks to Running Hadoop On Ubuntu Linux (Multi-Node Cluster), 90% of the setup was as easy as described.

But there were two problems that cost me a lot of time.

1. RSA authentication with SSH

The authorized_keys file must be accessible only by its owner. Don't forget to remove all access for group and others.
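Here is a minimal sketch of that check, written in Python only for illustration. The function names and the default path are my assumptions, not anything Hadoop or SSH provides; the code simply tests whether a file mode grants any permission bits to group or others.

```python
import os
import stat


def is_private_to_owner(mode: int) -> bool:
    """Return True if the mode grants no permission bits to group or others."""
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def check_authorized_keys(path: str) -> bool:
    """Check the permission bits of an existing file such as authorized_keys."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return is_private_to_owner(mode)


if __name__ == "__main__":
    # Assumed default location; adjust for the account that runs Hadoop.
    path = os.path.expanduser("~/.ssh/authorized_keys")
    if os.path.exists(path):
        state = "is private" if check_authorized_keys(path) else "is too permissive"
        print(path, state)
```

One common way to satisfy the check is `chmod 600 ~/.ssh/authorized_keys`.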

2. Host name resolution

Example hosts files for each node:

master

::1 localhost6.localdomain6 localhost6
192.168.10.21 master    # master must map to an IP address the slaves can reach (not ::1 or 127.0.0.1)
192.168.10.22 slave.yellow
192.168.10.23 slave.red

slave.yellow

127.0.0.1 slave.yellow localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.23 slave.red

slave.red

127.0.0.1 slave.red localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.22 slave.yellow

Be careful with these network settings, and Word Count should then run.
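The rule for the master entry can be checked mechanically. The following is a rough sketch under my own assumptions: the helper names `parse_hosts` and `loopback_bindings` are hypothetical, and `BROKEN_HOSTS` is an invented counterexample, not a file from my cluster. It parses hosts-file text and reports any loopback address that a host name is bound to.

```python
import ipaddress


def parse_hosts(text: str) -> dict:
    """Map each host name in hosts-file text to the list of addresses it is bound to."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, []).append(address)
    return table


def loopback_bindings(text: str, hostname: str) -> list:
    """Return the loopback addresses (127.0.0.0/8 or ::1) that hostname is bound to."""
    addresses = parse_hosts(text).get(hostname, [])
    return [a for a in addresses if ipaddress.ip_address(a).is_loopback]


MASTER_HOSTS = """\
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.22 slave.yellow
192.168.10.23 slave.red
"""

# A hypothetical broken variant, where master is also bound to 127.0.0.1.
BROKEN_HOSTS = "127.0.0.1 master localhost.localdomain localhost\n" + MASTER_HOSTS

print(loopback_bindings(MASTER_HOSTS, "master"))  # no loopback binding expected
print(loopback_bindings(BROKEN_HOSTS, "master"))  # reports the 127.0.0.1 binding
```

An empty result for master is what the slaves need: the name then resolves only to an address they can actually reach.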

Tested on 2008/9/14 with Hadoop 0.18.0, JDK 1.6.0_10, and CentOS 5.2.

Essay "Katakana" Dictionary

By MD on Aug 3, 2008 | In Information

Inspired by Haruki Murakami, the unique novelist, we have just launched a new blog, which is something like a "Katakana" dictionary written as essays.

We have three types of characters in Japanese: Kanji (漢字), Hiragana (ひらがな), and Katakana (カタカナ). Katakana represents sounds and is mostly used for foreign words like Coca-Cola (コカコーラ).

One Katakana word a day. Then, one guy writes an essay related to the word. Interesting, isn't it?

Here's the dictionary. Enjoy!
http://ameblo.jp/teamharukist/

Object fundamentalism and Business

By MD on Aug 2, 2008 | In Information

I love jazz, and I love playing the trumpet.

I don't like to manage or to be managed.

I love the open source mindset, much like a hippie's.

I love objects.

But the problem is, these things are not so successful in business.

Okay. Let me think. Why?

...

Is it only me who enjoys them?

That could be a reason.
But rather, enjoying it would be a prerequisite for success.

The reason is, perhaps, a lack of attention to the audience, the end users.

Playing the trumpet is a part of my life, even a part of myself.
So it would be great if the audience got excited or relaxed,
but that's not my first priority.

Object? That's for my business.

Object fundamentalism? I don't think I am, but some think I am.

Caché, the ultimate enterprise object-capable database.
db4o, an open source object database for embedded systems.
Rational, needless to say.

The Object Fundamentalism family has had a certain level of success.
But it seems to have been limited so far.

Why?

Through other businesses, I realized that
a successful technology provides end users with benefits directly.
Oracle, Google, VMware, Salesforce.
And they have lots of believers who bring the bible to end users
to integrate, convince, pray and sell.
Sometimes, those believers put some benefits on top of it,
but even without them, the bible itself is valuable.

What about Object Fundamentalism family?

Its value depends on engineers.
That means value is created *by engineers* for their customers.
So the point would be to hire a great engineer rather than to buy a product.

Object Fundamentalism itself is worth less than a piece of bread to end users.

How to improve the situation?

A product/service should have a clear benefit for end users, not (only) for engineers.

To end users, object terminology is, as an old Japanese saying goes, like reciting Buddha's words to a horse.

- I found an interesting story in "The Essential Drucker".

The three stonecutters who were asked what they were doing. The first replied, "I am making a living." The second kept on hammering while he said, "I am doing the best job of stonecutting in the entire country." The third one looked up with a visionary gleam in his eyes and said, "I am building a cathedral."

The third man is, of course, the true "manager." The first man knows what he wants to get out of the work and manages to do so. He is likely to give a "fair day's work for a fair day's pay." It is the second man who is a problem. Workmanship is essential; without it no business can flourish; in fact, an organization becomes demoralized if it does not demand of its members that most scrupulous workmanship they are capable of. But there is always a danger that the true workman, the true professional, will believe that he is accomplishing something when in effect he is just polishing stones or collecting footnotes. Workmanship must be encouraged in the business enterprise. But it must always be related to the needs of the whole.

UNIQLOCK

By MD on Jul 25, 2008 | In Brain

A bit late, but Enjoy!

BigTable (next generation database led by Google) 1

By MD on Jun 13, 2008 | In Database

BigTable. It can be as big as it sounds. According to Jeffrey Dean, a Google Fellow, the biggest one today holds up to 4000TB, spanning thousands of servers... and that is a single table!

Have you ever heard about BigTable? Unless you're a database vendor or a Google infrastructure freak, I'm afraid you haven't.

According to the paper titled Bigtable: A Distributed Storage System for Structured Data, it is described as follows:

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving).

Two points.

  • scale to a very large size: petabytes of data across thousands of commodity servers
  • both in terms of data size and latency requirements

How?

Performance matters

Reducing HDD seek time is key to a computer's overall performance, since a disk I/O costs on the order of a million times more than a CPU operation. So it's wise to fetch as big a chunk as possible at once.

Today, most user files are getting larger and larger. But the unit of an HDD stays small, usually a 512-byte sector, and a filesystem block on Linux ranges from 512 bytes up to 4KB.

Google File System, Google's underlying distributed filesystem, uses huge chunks, 64MB in size.
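To put rough numbers on the difference, here is a back-of-the-envelope sketch. The 10 ms seek time and the worst case of one seek per unit (a fully fragmented file) are my illustrative assumptions, not measurements.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

SEEK_SECONDS = 0.010  # assumed average seek time of a hard disk, for illustration only


def worst_case_seeks(file_size: int, unit_size: int) -> int:
    """Number of units (and so, at worst, seeks) needed to cover file_size bytes."""
    return -(-file_size // unit_size)  # ceiling division


units = [("512-byte sector", 512), ("4KB block", 4 * KB), ("64MB chunk", 64 * MB)]
for label, unit in units:
    seeks = worst_case_seeks(1 * GB, unit)
    print(f"{label}: {seeks} seeks, about {seeks * SEEK_SECONDS:.2f} s of seeking")
```

Under these assumptions, a fully fragmented 1GB file needs 262,144 seeks with 4KB blocks but only 16 with 64MB chunks. Real filesystems try hard to keep blocks contiguous, so the gap is smaller in practice, but the direction is the same.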

The next thing to consider is layout. How should data be laid out on disk blocks? Contiguous data can be read from or written to disk in one go.

A database usually stores each row in contiguous space. So as long as all the data you need is in a single record, you get the best performance. Some databases take another approach: column-oriented storage.
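The two layouts can be sketched with a tiny, made-up table. The sample rows and the function names are hypothetical; a real database packs bytes into pages rather than Python lists, so this only shows the ordering of values.

```python
rows = [
    {"id": 1, "name": "alice", "age": 30},
    {"id": 2, "name": "bob", "age": 25},
    {"id": 3, "name": "carol", "age": 41},
]
columns = ["id", "name", "age"]


def row_layout(rows, columns):
    """Row oriented: all the values of one record sit next to each other."""
    return [row[c] for row in rows for c in columns]


def column_layout(rows, columns):
    """Column oriented: all the values of one column sit next to each other."""
    return [row[c] for c in columns for row in rows]


print(row_layout(rows, columns))
print(column_layout(rows, columns))
```

Reading one whole record touches a contiguous run in the row layout, while scanning a single column (say, every age) touches a contiguous run in the column layout.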

BigTable is not a conventional table

It's more like a spreadsheet. And a map under the hood.
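The Bigtable paper describes the data model as a sparse, sorted map indexed by row key, column key and timestamp. As a toy sketch of that idea only (the class `TinySortedMap` and its methods are hypothetical and have nothing to do with Google's actual implementation or API), an in-memory version might look like this. The sample keys are modeled loosely on the paper's web-page example.

```python
import bisect


class TinySortedMap:
    """A toy, in-memory map from (row, column, timestamp) to a value, kept sorted by key."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column, timestamp)
        self._values = {}  # key -> value

    def put(self, row, column, timestamp, value):
        key = (row, column, timestamp)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get_latest(self, row, column):
        """Return the value with the highest timestamp for a cell, or None."""
        # Every stored key for this cell sorts between these two sentinels.
        lo = bisect.bisect_left(self._keys, (row, column, float("-inf")))
        hi = bisect.bisect_right(self._keys, (row, column, float("inf")))
        if lo == hi:
            return None
        return self._values[self._keys[hi - 1]]

    def scan_rows(self, start_row, end_row):
        """Yield (key, value) for rows in [start_row, end_row), in sorted order."""
        lo = bisect.bisect_left(self._keys, (start_row,))
        hi = bisect.bisect_left(self._keys, (end_row,))
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


table = TinySortedMap()
table.put("com.cnn.www", "contents:", 1, "<html>v1")
table.put("com.cnn.www", "contents:", 2, "<html>v2")
table.put("com.cnn.www", "anchor:cnnsi.com", 1, "CNN")
table.put("org.example", "contents:", 1, "<html>")

print(table.get_latest("com.cnn.www", "contents:"))
print([key for key, _ in table.scan_rows("com", "org")])
```

The spreadsheet feel comes from addressing a cell by row and column; the map feel comes from the fact that missing cells cost nothing and that rows are kept in sorted order, so a range of rows can be scanned together.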

BigTable offers a new approach to both performance and functionality. Next time, I will show you the details.

Tags: bigtable, google
  • Database is going to be..., like Brain. It is still on the long way, so I feel I stay the same. But I believe daily routine can make a difference to get there. This is a blog by Takenori Sato, an independent database consultant, leaving my footprints.

©2008 by Takenori Sato