Server Energy Consumption Estimation Project
By MD on Aug 31, 2009 | In Database | Send feedback »
According to "The Economic Meltdown of Moore’s Law and the Green Data Center", presented by The Uptime Institute Inc. in 2007, this year, 2009, is when three years of electricity for a server will cost as much as the server itself. By 2012, that electricity cost is expected to double.
What does this mean?
If you are a facility manager, you are surely very sensitive to energy costs already. But what about you, as an IT manager who develops software? Probably not, because your business requirements do not include energy costs.
But soon you will need to take them into account, if costs keep rising as expected. It's easy to measure a server's total power consumption just by plugging it into a power meter, but that doesn't tell you which component consumes how much.
Fortunately, many studies are available. Unfortunately, no practical tool or project is available today.
So next month I will start my own project, based on the study "Full-System Power Analysis and Modeling for Server Environments". An overview of the project is illustrated below.

Some months ago, I bought three inexpensive Dell T105 boxes with quad-core AMD Opteron CPUs and 4GB of RAM, so they look like a convenient place to start.
For the time being, the project targets Linux only, and the main performance-counter tool will be PerfSuite.
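To make the goal concrete, here is a minimal sketch of the kind of model such a project might fit: total power as an idle baseline plus one term per component, each scaled by a utilization figure derived from performance counters. The linear form, the component list, and every coefficient below are assumptions for illustration only; they are not values from the study or measurements from the boxes above.

```javascript
// Hypothetical coefficients in watts. Real values would be fitted per machine
// against readings from an external power meter.
const model = {
  idle: 80,   // power drawn when everything is idle (assumed)
  cpu: 60,    // extra watts at 100% CPU utilization (assumed)
  memory: 15, // extra watts at 100% memory-access rate (assumed)
  disk: 10,   // extra watts at 100% disk I/O rate (assumed)
};

// Estimate total power and the share attributed to each component.
// `util` holds utilizations normalized to the range 0..1.
function estimatePower(model, util) {
  const parts = {
    idle: model.idle,
    cpu: model.cpu * util.cpu,
    memory: model.memory * util.memory,
    disk: model.disk * util.disk,
  };
  const total = Object.values(parts).reduce((sum, watts) => sum + watts, 0);
  return { total, parts };
}

const estimate = estimatePower(model, { cpu: 0.5, memory: 0.2, disk: 0.1 });
console.log(estimate.total, estimate.parts);
```

The point of the breakdown in `parts` is exactly what a wall-socket meter cannot give: which component accounts for how much of the total.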
I don't expect major technical problems; the real question is whether a user community can be built around it. I guess the project could be funded by ad revenue from vendors.
But first, let's see whether my boxes can produce reliable results.
Vizoo, YouTube For Graphs
By MD on May 22, 2009 | In Database | Send feedback »
Vizoo, the YouTube for graphs, got TechCrunched. I have led its database design from scratch for over a year.
Not yet in English. But coming soon!
cSearch: to create, not to consume
By MD on May 18, 2009 | In Brain, Database | Send feedback »
Haruki Murakami is one of the most interesting writers in the world today.
As he often mentions, he likes Raymond Carver.
Once you read Ray's works, you never forget their flavor. It is strange. In the beginning everything looks normal, but by the end it has become a totally scary world, as if you had slipped through into another world.
Ray introduces small differences one at a time. You can move on without paying much attention. You can just fill each gap with an ordinary image.
But if you do, you find that such a scene is hard to picture. So you have to imagine.
Then, in the end, you have made up the story yourself out of the images you had.
That is the effect of blanks, or differences, I think. It is why you find something new every time you read.
With this hypothesis in mind, take a look at popular novels. They are filled with pictures, with no blanks left.
You can enjoy them the first time. The second? The third? Boring, boring, boring...
They are made to be consumed, whereas something with lots of blanks or differences is made to be created inside its readers.
Here, I see a significant difference. And it is a key, I think, to the coming world.
With this concept, I made a search system that gives you a "How come? Ah-ha!" search experience. It is available in 4 languages.
Examples are blogged here in English. Just try it, and tell me what it is.
Database that tells "Chigai"
By MD on Apr 4, 2009 | In Brain, Database | Send feedback »
This world is changing, from one where "how to do" matters to one where "what to do" does.
The problems people have are changing accordingly.
Answering "what to do" is not something others can or should do for you. But making a decision on "what to do" makes sense whenever something changes in your field.
"Chigai" is a Japanese word meaning "difference". What is interesting about "Chigai" is not the difference itself, but the gap that tells you A is different from B. So I could say that "Chigai" holds new information that neither A nor B holds.
A change is recognized as "Chigai", and is then perceived or interpreted until it is accepted.
With "Chigai" in mind, a database that tells "Chigai" makes more and more sense for the "what to do" era. Interestingly, such a database can tell you new information that it does not itself store.
Okay, so how do we recognize "Chigai"? Recent brain science offers an answer: by predicting what will come next, and comparing reality against that prediction.
I love the concept of "Chigai", and its interesting property: a gap, 無 (emptiness), holds the source of creativity.
Probabilistic Graph Model
By MD on Mar 29, 2009 | In Database | Send feedback »
I rushed to complete this for submission to "Common Persistent Model Patterns for Performance and/or Scalability Optimization: Call for Submissions" by ODBMS.ORG.
Recognize how concepts are formed on the Web
By MD on Mar 23, 2009 | In Database | Send feedback »
This is just a brief idea for recognizing how concepts are formed on the Web. The picture is dynamic, not static, and it is expressed in numbers rather than in the abstract, so that statistical techniques can be applied to it.
Invariant
First, let me introduce how our brain recognizes things. You may want to take a look at this video by Jeff Hawkins to catch up on the latest brain science. An "invariant", to use Jeff's term, is a slowly changing, mostly static concept thought to reside at various levels of our neocortex.
Our brain translates things into these invariants. That's why we can tell that you are human, and so am I. Otherwise, every thing would be different from every other thing, and your life would get wild.
An "invariant" *expects* what will happen next, based on its context, as actual input arrives. An alarm goes off if any difference is observed between the expectation and the actual input. Then your consciousness gets involved and pays attention.
It is natural to wonder whether we can refine Information Retrieval with such concepts.
Invariant representations for document
Let's consider documents, as usual in IR.
What is the "invariant" for a document? As you can guess, and as you can see in research on clustering, factorization, and topic estimation, "topic" looks like the most natural choice.
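Y = AX
Y: Word/Document (probability) matrix, one row per word and one column per document
X: Topic/Document (probability) matrix, one row per topic and one column per document
A: Word/Topic (probability) matrix, one row per word and one column per topic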
Think about such a matrix X, which shows the topical aspects of documents. Relax, you don't need to worry here about the exact meaning of each matrix. It is actually the equation used in GaP by Canny.
We get our "invariants" if we can find such A and X. Please note that this is just one way to get "invariants", though I think it is the best way to picture the idea.
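To picture the equation with numbers, here is a small sketch of the forward direction only: given A and X, compute Y. It assumes the shapes words x topics for A and topics x documents for X, so that the product is defined; the four words, two topics, and all probabilities are made up for illustration. Finding A and X from an observed Y, which is what a method like GaP does, is not shown.

```javascript
// A: words x topics. Column k is topic k's probability distribution over words.
// Rows correspond to the (made-up) words: quake, tremor, goal, match.
const A = [
  [0.5, 0.0],
  [0.5, 0.0],
  [0.0, 0.5],
  [0.0, 0.5],
];

// X: topics x documents. Column j is document j's mixture of topics.
const X = [
  [1.0, 0.25],
  [0.0, 0.75],
];

// Plain matrix product Y = A X, giving a words x documents matrix.
function multiply(A, X) {
  return A.map((row) =>
    X[0].map((_, j) => row.reduce((sum, a, k) => sum + a * X[k][j], 0))
  );
}

const Y = multiply(A, X);
console.log(Y);
```

Each column of X is the compact, topic-level description of one document, which is the role the "invariant" plays in the text.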
Difference, Earthquake, Concept Map
We can apply this idea to Information Retrieval, not for search itself, but to see the statistical aspects of the underlying document collection.
These will form X in the equation above.
Now you have a numeric "invariant" representation for a query, that is, for a concept.
You can measure differences, deviations, means, whatever you like.
If we measure the deviations shown by concepts, as we would for "Earthquake", isn't it safe to say that concepts with similar measures are geometrically close to each other? Then we get a dynamic "Concept Map" of the web.
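As a rough sketch of "geometrically close", suppose each concept is represented by a vector of topic weights and closeness is measured by cosine similarity. Both the choice of cosine similarity and the three-topic vectors below are assumptions made for illustration; any other distance over the same vectors would fit the idea equally well.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(u, v) {
  let dot = 0;
  let normU = 0;
  let normV = 0;
  for (let i = 0; i < u.length; i++) {
    dot += u[i] * v[i];
    normU += u[i] * u[i];
    normV += v[i] * v[i];
  }
  return dot / Math.sqrt(normU * normV);
}

// Hypothetical topic weights for three concepts (topics: disaster, ocean, sport).
const concepts = {
  earthquake: [0.7, 0.2, 0.1],
  tsunami: [0.6, 0.3, 0.1],
  football: [0.1, 0.1, 0.8],
};

const nearTsunami = cosine(concepts.earthquake, concepts.tsunami);
const nearFootball = cosine(concepts.earthquake, concepts.football);
console.log(nearTsunami.toFixed(3), nearFootball.toFixed(3));
```

Recomputing the vectors as the document collection changes is what would make such a map dynamic rather than static.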
Document Similarity By Color
By MD on Feb 27, 2009 | In Database | Send feedback »
The way to get document similarity by color
This is the invention for which I filed a patent application last week. It is the key, I think, to taking Information Retrieval one level higher.
The first demo will be launched in late March.
Background
How is the performance of Information Retrieval measured? If you are not familiar with "Precision" and "Recall", look them up under Information Retrieval.
We have two major Information Retrieval tools: a database and a search engine.
A database provides higher precision and lower recall. That means you need to specify detailed search criteria to get what you need.
A search engine does the opposite: lower precision and higher recall. That means you are likely to get what you need, but you have to find it among a huge number of results.
Wait, Google is not bad like that.
True. The picture above is missing two practical concepts: search results can be ordered, and search results can be paged.
The earlier you get what you need, the higher the precision you experience. So a search engine usually provides both higher precision and higher recall, though it depends on ranking.
It is a great innovation, actually, and it has driven the adoption of search-engine-like capability everywhere.
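For reference, precision and recall as used in this section can be computed as below. The document ids and the two result sets are made up to mirror the database versus search engine contrast; they are not measurements.

```javascript
// precision = relevant results returned / all results returned
// recall    = relevant results returned / all relevant documents
function precisionRecall(retrieved, relevant) {
  const relevantSet = new Set(relevant);
  const hits = retrieved.filter((id) => relevantSet.has(id)).length;
  return {
    precision: retrieved.length ? hits / retrieved.length : 0,
    recall: relevant.length ? hits / relevant.length : 0,
  };
}

// Hypothetical collection with four relevant documents.
const relevant = ["d1", "d2", "d3", "d4"];

// A narrow, database-style query: everything returned is relevant, but half is missed.
const databaseQuery = precisionRecall(["d1", "d2"], relevant);

// A broad, search-engine-style query: everything relevant is found, among noise.
const searchEngine = precisionRecall(
  ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"],
  relevant
);

console.log(databaseQuery, searchEngine);
```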
Problem
It has been more than 10 years since Google, or Stanford actually, filed the patent application for PageRank. That was the birth of the search engine era.
Thanks to the spread of such a great technology, the number of web documents exploded, in what is known as the Information Explosion.
For example, say you find document A at 5th place out of 100 results. One year later, the results have doubled, and you find document A at 10th place out of 200 results. One more year later, it drops off the first page. (I assume ranking technology stays the same.)
Meanwhile, what about the size of a page? That depends on human capability rather than technology, so it stays almost the same: 10 results or so per page. No?
Now we have a problem. The precision gets lower and lower.
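The arithmetic behind the example above can be written out as follows. It assumes, as the example does, that both the rank of document A and the total number of results double each year while the page size stays at 10; the function names are just for illustration.

```javascript
const PAGE_SIZE = 10; // results per page, assumed constant

// Page on which a result at the given 1-based rank appears.
function pageOf(rank) {
  return Math.ceil(rank / PAGE_SIZE);
}

// Project rank and result count forward, doubling both every year.
function project(rank, total, years) {
  const rows = [];
  for (let year = 0; year <= years; year++) {
    rows.push({ year, rank, total, page: pageOf(rank) });
    rank *= 2;
    total *= 2;
  }
  return rows;
}

const timeline = project(5, 100, 2);
console.log(timeline);
```

With these assumptions, document A sits on page 1 at ranks 5 and 10, and moves to page 2 at rank 20 in the second year.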
To make matters worse, one of the major search engine improvements is topic recognition. With topic recognition, even more data comes in.
In fact, rather than recognizing more topics, what Google is doing is narrowing down: personalizing search results.
So, I think, we need something that improves how well a human can take in a page of results. Otherwise we will get even lower precision without gaining much from topics or ontologies.
Solution
That brings us to my invention: Document Similarity By Color.
With this technology, you can see the topics a document represents by its color, and then search for what you want as if chasing one color.
I think this implies a power shift to come, so I call the phenomenon the "Liberation Of Search".
Just my 2cents ...
Device Database Platform
By MD on Dec 20, 2008 | In Database | Send feedback »
What is the paramount virtue of Unix for you?
For me, it's modularity. It is amazing, almost unbelievable, that the design has survived for more than 30 years, and especially through the last decade.
There is a reason I am so astonished.
I belong to the Java generation. I thought of C as something rigid in the face of changes in both requirements and hardware; I mean, not something you could use to cope with change. But that turns out not to be true, as you can see from the wide variety of places it has been adopted, and especially from the appearance of Linux.
Linux, what a great adoption of Unix.
Legend has it that the kernel developers were constantly tweaking the C code they wrote in the Linux kernel in order to control the 80x86 machine code that the GCC compiler was producing.
says Randall Hyde in his book, "WRITE GREAT CODE VOLUME 2: THINKING LOW-LEVEL, WRITING HIGH-LEVEL".
In general, Java guys are indifferent to low-level issues, either because they consider them the JVM's job or because they have never thought about them. Of course, I used to be one of them. Such guys can count on one of the best possible implementations by using the Java libraries written by the Java developers, but how does the code they write themselves get translated into efficient bytecode, and then into efficient machine code for the target machine?
True, in this era of virtualization you may wonder whether such an effort makes any difference. I agree. How many layers of abstraction might your application have to go through? Ten?
Besides, you might argue by quoting the phrase "premature optimization is the root of all evil". But then I will ask: have you ever succeeded in tuning performance at the last minute without upgrading the hardware?
I have felt, how can I say, like a bird in a cage. If you feel like that too, you may want to forget about all those abstractions and think at the hardware level from scratch.
What came to me then was the idea of a Device Database Platform.
It is a modular database. Think of the VFS in the Linux kernel.
There are a couple of reasons why a device database has to be modular.
Software on a device has to be optimized as far as possible. Labor is considered cheaper than the extra per-unit hardware cost that inefficient code brings. So you need to get rid of unnecessary parts and replace an inefficient module with one designed for the job. This requirement keeps device guys on roll-your-own (RYO) databases, simply because a commercial database lacks the ability to be customized, which makes it quite expensive on a device.
It is easy to imagine that you can design a more efficient algorithm for a specific use. Besides, many CPU architectures and operating systems are involved. So, just as the Linux kernel guys find, there is a lot of room for improvement in every combination.
Actually, device guys have done exactly that, so they write their own databases. But it is also true that they want a database.
When I consider these conditions, and learn from Unix and Linux, what occurs to me is to design and write a modular database.
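A very small sketch of what "modular, like VFS" could mean: the database core talks only to a fixed table of operations, and the storage module behind that table can be swapped for one tuned to a given device. The module names, the two-operation interface, and both implementations are hypothetical; a real platform would need far more (transactions, iteration, recovery).

```javascript
// Storage module 1: hash-based, fast lookups, more memory.
function createHashStore() {
  const map = new Map();
  return {
    name: "hash",
    put: (key, value) => { map.set(key, value); },
    get: (key) => map.get(key),
  };
}

// Storage module 2: a flat array scanned linearly, minimal bookkeeping.
function createArrayStore() {
  const entries = []; // [key, value] pairs
  const find = (key) => entries.find((entry) => entry[0] === key);
  return {
    name: "array",
    put: (key, value) => {
      const hit = find(key);
      if (hit) hit[1] = value;
      else entries.push([key, value]);
    },
    get: (key) => {
      const hit = find(key);
      return hit ? hit[1] : undefined;
    },
  };
}

// The core only sees the operations table, much as VFS callers only see
// the file operations and not the concrete filesystem behind them.
function createDatabase(store) {
  return {
    module: store.name,
    put: (key, value) => store.put(key, value),
    get: (key) => store.get(key),
  };
}

const dbA = createDatabase(createHashStore());
const dbB = createDatabase(createArrayStore());
for (const db of [dbA, dbB]) {
  db.put("sensor:1", 42);
  db.put("sensor:1", 43); // overwrite
  console.log(db.module, db.get("sensor:1"));
}
```

Application code written against `createDatabase` stays the same while the module underneath is replaced, which is the property the device case calls for.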
Get Hadoop up and running without DNS
By MD on Sep 14, 2008 | In Database | Send feedback »
Over the last couple of days, I have been trying to get the Hadoop Word Count example running on my local cluster of 3 CentOS boxes.
Thanks to Running Hadoop On Ubuntu Linux (Multi-Node Cluster), 90% of the setup was as easy as described.
But there were two problems that cost me time.
1. RSA authentication with SSH
The authorized_keys file has to be accessible only by its owner. Don't forget to remove all access for group and others.
2. Host name resolution
Examples of hosts files

master
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.22 slave.yellow
192.168.10.23 slave.red
* master has to have an IP address that the slaves can reach (not ::1 nor 127.0.0.1)

slave.yellow
127.0.0.1 slave.yellow localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.23 slave.red

slave.red
127.0.0.1 slave.red localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.22 slave.yellow
Get these network settings right, and the Word Count should run.
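The host name rule above is easy to get wrong, so here is a small sketch that checks it: parse a hosts file and confirm that a given name maps first to a non-loopback address. The helper names are made up, and treating the first matching line as the one that wins is a simplifying assumption about how the resolver behaves.

```javascript
// Parse hosts-file text into a map of host name -> list of addresses, in file order.
function parseHosts(text) {
  const table = new Map();
  for (const rawLine of text.split("\n")) {
    const line = rawLine.replace(/#.*/, "").trim(); // drop comments and blanks
    if (!line) continue;
    const [address, ...names] = line.split(/\s+/);
    for (const name of names) {
      if (!table.has(name)) table.set(name, []);
      table.get(name).push(address);
    }
  }
  return table;
}

function isLoopback(address) {
  return address === "::1" || address.startsWith("127.");
}

// True when `name` is listed and its first address is not a loopback address.
function resolvesToReachableAddress(hostsText, name) {
  const addresses = parseHosts(hostsText).get(name) || [];
  return addresses.length > 0 && !isLoopback(addresses[0]);
}

// The master's hosts file from the example above.
const masterHosts = [
  "::1 localhost6.localdomain6 localhost6",
  "192.168.10.21 master",
  "192.168.10.22 slave.yellow",
  "192.168.10.23 slave.red",
].join("\n");

// A typical broken variant: the machine's own name bound to loopback.
const brokenHosts = "127.0.0.1 master localhost.localdomain localhost";

console.log(resolvesToReachableAddress(masterHosts, "master"));
console.log(resolvesToReachableAddress(brokenHosts, "master"));
```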
Tested on 2008/9/14 with Hadoop-0.18.0, JDK1.6.0_10, CentOS5.2
BigTable(next generation database led by Google) 1
By MD on Jun 13, 2008 | In Database | 1 feedback »
BigTable. It can be as big as it sounds. According to Jeffrey Dean, a Google Fellow, the biggest one today is up to 4000TB, spanning thousands of servers... and that is just one table!!!
Have you ever heard about BigTable? Unless you're a database vendor or a Google infrastructure freak, I'm afraid you haven't.
According to the paper titled Bigtable: A Distributed Storage System for Structured Data, it is like:
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving).
Two points.
- scale to a very large size: petabytes of data across thousands of commodity servers
- both in terms of data size and latency requirements
How?
Performance matters
Reducing HDD seek time is key to a computer's overall performance, since I/O costs about a million times more than CPU work. So it is wise to fetch as big a chunk as possible at once.
Today, user files keep getting larger and larger. Yet the unit of an HDD stays small, usually a 512-byte sector, and a filesystem block on Linux is 512 bytes to 4KB.
Google File System, Google's underlying distributed filesystem, uses huge chunks, 64MB in size.
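To get a feel for the difference, the sketch below counts how many units a file breaks into at a 4KB block size versus a 64MB chunk size. The 1GB file is an arbitrary example, and the count is only an upper bound on separate fetches, since contiguous blocks can often be read together.

```javascript
const KiB = 1024;
const MiB = 1024 * KiB;
const GiB = 1024 * MiB;

// Number of fixed-size units needed to hold a file of the given size.
function unitsNeeded(fileBytes, unitBytes) {
  return Math.ceil(fileBytes / unitBytes);
}

const fileSize = 1 * GiB; // example file size (assumed)
const blocks = unitsNeeded(fileSize, 4 * KiB);
const chunks = unitsNeeded(fileSize, 64 * MiB);

console.log(blocks, chunks, blocks / chunks);
```

For the same file, the 64MB chunk size leaves 16 units to locate and track instead of 262144.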
The next thing to consider is layout. How should data be laid out in a block? Contiguous data can be read from or written to disk in one go.
A database usually puts one row in contiguous space. So as long as you put all the data you need in a single record, you get the best performance. Some databases take another approach: column-oriented storage.
BigTable is not a conventional table
It's more like a spreadsheet. And a map under the hood.
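The Bigtable paper describes the data model as a sparse, sorted map indexed by row key, column key, and timestamp. The toy class below sketches only that indexing idea in memory; the class name and methods are made up, and nothing here is sorted by row, distributed, or persistent.

```javascript
// (row, column, timestamp) -> value, with several versions kept per cell.
class TinyTable {
  constructor() {
    this.rows = new Map(); // row key -> Map(column key -> versions, newest first)
  }

  put(row, column, timestamp, value) {
    if (!this.rows.has(row)) this.rows.set(row, new Map());
    const columns = this.rows.get(row);
    if (!columns.has(column)) columns.set(column, []);
    const versions = columns.get(column);
    versions.push({ timestamp, value });
    versions.sort((a, b) => b.timestamp - a.timestamp); // keep newest first
  }

  // Newest value at or before `timestamp`; the newest overall by default.
  get(row, column, timestamp = Infinity) {
    const columns = this.rows.get(row);
    const versions = (columns && columns.get(column)) || [];
    const hit = versions.find((version) => version.timestamp <= timestamp);
    return hit ? hit.value : undefined;
  }
}

const table = new TinyTable();
table.put("com.cnn.www", "contents:", 3, "<html>v1</html>");
table.put("com.cnn.www", "contents:", 5, "<html>v2</html>");
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN");

console.log(table.get("com.cnn.www", "contents:"));
console.log(table.get("com.cnn.www", "contents:", 4));
```

Rows need not share the same columns, which is where the spreadsheet comparison comes from: most cells are simply absent.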
BigTable offers a new way both in performance and functionality. Next time, I will show you details.
Google Visualization API On Your Site Example
By MD on Jun 7, 2008 | In Ease Of Use | 1 feedback »
This is a Google Visualization Gadget example that is hosted on my site.
SONY's sales and operating profits by segment since 1996. Enjoy!
Visualize Your Data with Google Visualization API
By MD on May 30, 2008 | In Ease Of Use | 1 feedback »
Today, on the 2nd day of Google I/O, a new Google Visualization API was announced.
What's new?
1. events
2. gadget.draw()
The Google Visualization Gadget supports selection events, so a developer can respond to what end users select.
With the draw() method, a gadget can draw any table, as long as the table supports the pre-defined DataTable API.
Let's try the new draw() method with the JSON example.
Instead of writing the table with HTML tags, simply pass a DataTable object to the Table gadget.
Here's the code.
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> | |
<html> | |
<head> | |
<script type="text/javascript" src="http://www.google.com/jsapi"></script> | |
<script type="text/javascript"> | |
| |
google.load("visualization", "1", {packages:["table"]}); | |
| |
var xhr; | |
try { xhr = new ActiveXObject('Msxml2.XMLHTTP'); } | |
catch (e) | |
{ | |
try { xhr = new ActiveXObject('Microsoft.XMLHTTP'); } | |
catch (e2) | |
{ | |
try { xhr = new XMLHttpRequest(); } | |
catch (e3) { xhr = false; } | |
} | |
} | |
| |
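xhr.open("GET", DATASOURCE_URL, true); | |
xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded"); | |
| |
// Register the handler before send() so that no state change is missed. | |
xhr.onreadystatechange = function() | |
{ | |
if(xhr.readyState == 4) | |
{ | |
if(xhr.status == 200){ | |
// Evaluate the response as a JavaScript object literal; only do this with a trusted data source. | |
var the_object = eval( "(" + xhr.responseText + ")" ); | |
handleQueryResponse(the_object.table); | |
}else{ | |
// Request failed: the "Loading..." placeholder stays on the page. | |
} | |
} | |
}; | |
| |
xhr.send(null); | |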
| |
// Query response handler function. | |
function handleQueryResponse(table) { | |
var data = new google.visualization.DataTable(); | |
| |
// convert to Google.DataTable | |
// column | |
for (var col = 0; col < table.cols.length; col++) { | |
data.addColumn('string', table.cols[col].label); | |
} | |
// row | |
for (var row = 0; row < table.rows.length; row++) { | |
data.addRow(); | |
for (var col = 0; col < table.cols.length; col++) { | |
data.setCell(row, col, table.rows[row][col].v); | |
} | |
} | |
| |
var vis_table = new google.visualization.Table(document.getElementById('table_div')); | |
vis_table.draw(data, {showRowNumber: false}); | |
| |
} | |
| |
</script> | |
</head> | |
| |
<body> | |
<div id="table_div">Loading...</div> | |
</body> | |
</html> |
Thanks a lot, Google Visualization Team! Now our own data can be easily integrated with a Gadget!
JSON Restful Web Service in Java
By MD on May 23, 2008 | In Ease Of Use | 5 feedbacks »
What options are available for building a JSON RESTful Web Service in Java?
Wait, what is JSON? And what exactly is a RESTful Web Service?
Actually, I got to know them only a couple of weeks ago.
If you have tried XML Web Services with SOAP, you probably know how hard it is to get performance and ease of use without complexity (especially around schema binding).
In that case, a JSON RESTful Web Service should interest you.
Let's go back to the first question. Two providers come to mind first: Sun and Apache. Here are the first options to look at.
In my opinion, they are just a topping on XML Web Services, and that is a basic mismatch. Feel free to try them if you want to make sure.
Then there is Restlet. This is the one that gives you the true power of a JSON RESTful Web Service.
Take a look at the Introduction on the site if you are not 100% sure about RESTful Web Services (REST).
I'm going to show you an example that implements a service returning a DataTable compatible with the Google Visualization API. The source of information is this thread in Google Visualization Discussion.
Code:
<CHUNK1> | |
google.visualization.Query.setResponse( | |
{ | |
requestId:'0', | |
status:'ok', | |
signature:'6173382439516707022', /* Changes when data changes */ | |
</CHUNK1> | |
So the timeseries gadget must download this JSON every so often and | |
check the new signature against it's own. If the signature is | |
different, it knows that the data on the "spreadsheet" has changed. | |
<CHUNK2> | |
table:{ | |
cols: | |
[ | |
{ | |
id:'A', | |
label:'Date', | |
type:'d', /* d=date */ | |
pattern:'M/d/yyyy' /* unique date pattern */ | |
}, | |
{ | |
id:'B', | |
label:'Budget', | |
type:'n', /* n=number */ | |
pattern:'#0.###############' /* unique number pattern */ | |
}, | |
{ | |
id:'C', | |
label:'Revenue', | |
type:'n', | |
pattern:'#0.###############' | |
}, | |
{ | |
id:'D', | |
label:'Movie', | |
type:'t', /* t=text */ | |
pattern:'' /* there is no text pattern */ | |
} | |
], | |
</CHUNK2> | |
Chunk2 shows the columns needed for a timeseries chart. (Date, | |
Value1, ..., ValueN, PopupText) | |
Note the type and pattern fields. d=date, n=number, t=text | |
<CHUNK3> | |
rows: | |
[ | |
[ | |
{ | |
v:new Date(1981,10,6), | |
f:'11/6/1981' | |
}, | |
{ | |
v:5000000.0, | |
f:'5000000' | |
}, | |
{ | |
v:4.2365581E7, | |
f:'42365581' | |
}, | |
{ | |
v:'Time Bandits' | |
} | |
], | |
</CHUNK3> |
Producing this kind of JSON object is the goal. So how do we do it?
To understand the following code, you may want to read the tutorial on how Restlet models the resource-oriented RESTful world.
Restlet
Code:
import org.restlet.Application; | |
import org.restlet.Context; | |
import org.restlet.Restlet; | |
import org.restlet.Router; | |
| |
public class JSONApplication extends Application { | |
| |
public JSONApplication() { | |
super(); | |
} | |
| |
public JSONApplication(Context arg0) { | |
super(arg0); | |
} | |
| |
@Override | |
public Restlet createRoot() { | |
Router router = new Router(getContext()); | |
| |
router.attach("/table", JSONTableResource.class); | |
| |
return router; | |
} | |
| |
} |
Table Resource and JSON Representation
Code:
import org.json.JSONArray; | |
import org.json.JSONException; | |
import org.json.JSONObject; | |
import org.restlet.Context; | |
import org.restlet.data.CharacterSet; | |
import org.restlet.data.MediaType; | |
import org.restlet.data.Request; | |
import org.restlet.data.Response; | |
import org.restlet.data.Status; | |
import org.restlet.ext.json.JsonRepresentation; | |
import org.restlet.resource.Representation; | |
import org.restlet.resource.Resource; | |
import org.restlet.resource.ResourceException; | |
import org.restlet.resource.Variant; | |
| |
public class JSONTableResource extends Resource { | |
| |
private static final String JSON_NAME_TABLE = "table"; | |
| |
private static final String JSON_NAME_COLUMNS = "cols"; | |
private static final String JSON_NAME_COLUMNS_ID = "id"; | |
private static final String JSON_NAME_COLUMNS_LABEL = "label"; | |
private static final String JSON_NAME_COLUMNS_TYPE = "type"; | |
private static final String JSON_NAME_COLUMNS_PATTERN = "pattern"; | |
| |
private static final String JSON_NAME_ROWS = "rows"; | |
private static final String JSON_NAME_ROWS_V = "v"; | |
private static final String JSON_NAME_ROWS_F = "f"; | |
| |
public JSONTableResource(Context context, Request request, Response response) { | |
super(context, request, response); | |
| |
getVariants().add(new Variant(MediaType.APPLICATION_JSON)); | |
} | |
| |
@Override | |
public Representation represent(Variant variant) throws ResourceException { | |
JSONObject json = new JSONObject(); | |
| |
try { | |
| |
json.put("requestId", "0"); | |
json.put("status", "ok"); | |
json.put("signature", "6173382439516707022"); | |
| |
json.put(JSON_NAME_TABLE, this.createTable()); | |
| |
} catch (JSONException e) { | |
throw new ResourceException(Status.SERVER_ERROR_INTERNAL); | |
} | |
| |
JsonRepresentation jr = new JsonRepresentation(json); | |
| |
jr.setCharacterSet(CharacterSet.UTF_8); | |
| |
return jr; | |
} | |
| |
private JSONObject createTable() throws JSONException{ | |
JSONArray columns = new JSONArray(); | |
JSONArray rows = new JSONArray(); | |
JSONObject r_c = new JSONObject(); | |
r_c.put(JSON_NAME_COLUMNS, columns); | |
r_c.put(JSON_NAME_ROWS, rows); | |
| |
this.createColumns(columns); | |
this.createRows(rows); | |
| |
return r_c; | |
} | |
| |
private void createColumns(JSONArray columns) throws JSONException{ | |
| |
columns.put(this.createColumn("A", "Date", "d", "M/d/yyyy")); | |
columns.put(this.createColumn("B", "Budget", "n", "#0.###############")); | |
columns.put(this.createColumn("C", "Revenue", "n", "#0.###############")); | |
columns.put(this.createColumn("D", "Movie", "t", "")); | |
| |
} | |
| |
private void createRows(JSONArray rows) throws JSONException{ | |
| |
JSONArray row = new JSONArray(); | |
| |
row.put(this.createCell("new Date(1981,10,6)", "11/6/1981")); | |
row.put(this.createCell("5000000.0", "5000000")); | |
row.put(this.createCell("4.2365581E7", "42365581")); | |
row.put(this.createCell("Time Bandits", null)); | |
| |
rows.put(row); | |
| |
} | |
| |
private JSONObject createCell(String v, String f) throws JSONException{ | |
JSONObject jo = new JSONObject(); | |
jo.put(JSON_NAME_ROWS_V, v); | |
if(f != null) | |
jo.put(JSON_NAME_ROWS_F, f); | |
return jo; | |
} | |
| |
private JSONObject createColumn(String id, String label, String type, String pattern) throws JSONException{ | |
JSONObject jo = new JSONObject(); | |
jo.put(JSON_NAME_COLUMNS_ID, id); | |
jo.put(JSON_NAME_COLUMNS_LABEL, label); | |
jo.put(JSON_NAME_COLUMNS_TYPE, type); | |
jo.put(JSON_NAME_COLUMNS_PATTERN, pattern); | |
return jo; | |
} | |
| |
} |
web.xml
XML:
<?xml version="1.0" encoding="UTF-8"?> | |
<web-app id="WebApp_ID" version="2.4" | |
xmlns="http://java.sun.com/xml/ns/j2ee" | |
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | |
xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee | |
http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd"> | |
<display-name>first steps servlet</display-name> | |
<!-- Application class name --> | |
<context-param> | |
<param-name>org.restlet.application</param-name> | |
<param-value> | |
firstSteps.JSONApplication | |
</param-value> | |
</context-param> | |
| |
<!-- Restlet adapter --> | |
<servlet> | |
<servlet-name>RestletServlet</servlet-name> | |
<servlet-class> | |
com.noelios.restlet.ext.servlet.ServerServlet | |
</servlet-class> | |
</servlet> | |
| |
<!-- Catch all requests --> | |
<servlet-mapping> | |
<servlet-name>RestletServlet</servlet-name> | |
<url-pattern>/json/*</url-pattern> | |
</servlet-mapping> | |
</web-app> |
You can get JSON Object in Java here.
Here is the JSON text you will get.
Code:
{"status":"ok","requestId":"0", | |
"table": | |
{"cols":[ | |
{"id":"A","pattern":"M/d/yyyy","label":"Date","type":"d"}, | |
{"id":"B","pattern":"#0.###############","label":"Budget","type":"n"}, | |
{"id":"C","pattern":"#0.###############","label":"Revenue","type":"n"}, | |
{"id":"D","pattern":"","label":"Movie","type":"t"} | |
], | |
"rows":[ | |
[ | |
{"f":"11/6/1981","v":"new Date(1981,10,6)"}, | |
{"f":"5000000","v":"5000000.0"}, | |
{"f":"42365581","v":"4.2365581E7"}, | |
{"v":"Time Bandits"} | |
] | |
] | |
}, | |
"signature":"6173382439516707022"} |
Now you can call the service, get the DataTable, and process it for visualization. Here is an example that shows a simple table using Ajax.
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> | |
<html> | |
<head> | |
<script type="text/javascript"> | |
| |
var xhr; | |
try { xhr = new ActiveXObject('Msxml2.XMLHTTP'); } | |
catch (e) | |
{ | |
try { xhr = new ActiveXObject('Microsoft.XMLHTTP'); } | |
catch (e2) | |
{ | |
try { xhr = new XMLHttpRequest(); } | |
catch (e3) { xhr = false; } | |
} | |
} | |
| |
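xhr.open("GET", JSON_RESTFULL_WEB_SERVICE_URL, true); | |
xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded"); | |
| |
// Register the handler before send() so that no state change is missed. | |
xhr.onreadystatechange = function() | |
{ | |
if(xhr.readyState == 4) | |
{ | |
if(xhr.status == 200){ | |
// Evaluate the response as a JavaScript object literal; only do this with a trusted data source. | |
var the_object = eval( "(" + xhr.responseText + ")" ); | |
handleQueryResponse(the_object.table); | |
}else{ | |
// Request failed: the "Loading..." placeholder stays on the page. | |
} | |
} | |
}; | |
| |
xhr.send(null); | |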
| |
// Query response handler function. | |
function handleQueryResponse(table) { | |
| |
var html = []; | |
html.push('<table border="1">'); | |
| |
// Header row | |
html.push('<tr><th>Seq</th>'); | |
for (var col = 0; col < table.cols.length; col++) { | |
html.push('<th>' + escapeHtml(table.cols[col].label) + '</th>'); | |
} | |
html.push('</tr>'); | |
| |
for (var row = 0; row < table.rows.length; row++) { | |
html.push('<tr><td align="right">' + (row + 1) + '</td>'); | |
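for (var col = 0; col < table.cols.length; col++) { | |
// The service above sends the short type code 'n' for numbers; 'number' is accepted as well. | |
var type = table.cols[col].type; | |
var cell = table.rows[row][col]; | |
html.push((type == 'n' || type == 'number') ? '<td align="right">' : '<td>'); | |
// Text cells carry no formatted value f, so fall back to the raw value v. | |
html.push(escapeHtml(cell.f != null ? cell.f : String(cell.v))); | |
html.push('</td>'); | |
} | |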
html.push('</tr>'); | |
} | |
html.push('</table>'); | |
| |
document.getElementById('tablediv').innerHTML = html.join(''); | |
} | |
| |
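function escapeHtml(text) { | |
if (text == null) | |
return ''; | |
| |
// Replace & first so the entities produced below are not escaped again. | |
return String(text).replace(/&/g, '&amp;') | |
.replace(/</g, '&lt;') | |
.replace(/>/g, '&gt;') | |
.replace(/"/g, '&quot;'); | |
} | |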
| |
</script> | |
</head> | |
| |
<body> | |
<div id="tablediv">Loading...</div> | |
</body> | |
</html> |
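One thing to notice in the JSON text the service returns: every v arrives as a string, including the date and the numbers, because createCell stores its arguments as strings. A client that wants typed values therefore has to convert them. The sketch below shows one possible conversion driven by the column type codes ('d', 'n', 't'); the function name and the regular expression for the date literal are assumptions made for this example.

```javascript
// The table part of the response shown earlier, as the client receives it.
const response = {
  table: {
    cols: [
      { id: "A", label: "Date", type: "d" },
      { id: "B", label: "Budget", type: "n" },
      { id: "C", label: "Revenue", type: "n" },
      { id: "D", label: "Movie", type: "t" },
    ],
    rows: [[
      { v: "new Date(1981,10,6)", f: "11/6/1981" },
      { v: "5000000.0", f: "5000000" },
      { v: "4.2365581E7", f: "42365581" },
      { v: "Time Bandits" },
    ]],
  },
};

// Convert a string value to a typed one according to the column type code.
function toTyped(type, v) {
  if (type === "n") return Number(v);
  if (type === "d") {
    const match = /^new Date\((\d+),\s*(\d+),\s*(\d+)\)$/.exec(v);
    if (!match) throw new Error("Unrecognized date literal: " + v);
    // The month is zero-based, as in the JavaScript Date constructor.
    return new Date(Number(match[1]), Number(match[2]), Number(match[3]));
  }
  return v; // 't' and anything else stay as text
}

const typedRows = response.table.rows.map((row) =>
  row.map((cell, i) => toTyped(response.table.cols[i].type, cell.v))
);

console.log(typedRows[0]);
```

Parsing the date with a pattern avoids evaluating text from the response as code just to build a Date.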
Simple, Fast, and Flexible. Experience JSON Restful Web Service!
REST: A Resource-Oriented Way of Thinking
By MD on May 20, 2008 | In Database | Send feedback »
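Well, then.
What comes to mind when you hear "REST"?
Take a moment to think about it. What image do you have of it?
- Something that cuts corners to simplify Web Services, because they are tedious and complex?
- Something like Ruby On Rails?
- Ajax?
- That Google-like thing where you access everything by URL?
Since none of that is very clear,
let's start by looking up the definition of the term.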
Representational State Transfer
Representational state transfer (REST) is a style of software architecture for distributed hypermedia systems such as the World Wide Web. The terms “representational state transfer” and “REST” were introduced in 2000 in the doctoral dissertation of Roy Fielding,[1] one of the principal authors of the Hypertext Transfer Protocol (HTTP) specification. The terms have since come into widespread use in the networking community.
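That is how Wikipedia introduces it. It sounds rather abstruse, but apparently it is an idea published in 2000 by one of the fathers of HTTP.
It then goes on to explain the concept that underpins the REST idea: the Resource.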
REST's central principle: resources
An important concept in REST is the existence of resources (sources of specific information), each of which can be referred to using a global identifier (a URI).
...
For example, a resource which is a circle may accept and return a representation which specifies a center point and radius, formatted in SVG, but may also accept and return a representation which specifies any three distinct points along the curve as a comma-separated list.
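A "circle" is given as the example of a resource, but anything at all that you might want, this thing or that thing, can be treated as a "resource". The question is how to represent it. That representation is what is called a Representation.
For the "circle" resource, the following Representations are mentioned:
- A center point and a radius
- SVG format
- A CSV file (one record holding any three points on the circle)
Having a "circle" as the example makes things a little awkward, but the point is that a resource can be obtained in various forms like this.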
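As a minimal sketch of one resource with several representations, the code below renders the same circle as center-plus-radius JSON, as SVG, and as a comma-separated list of three points on the curve. The field names, the media types used as keys, and the choice of the three points are assumptions for illustration.

```javascript
// A hypothetical "circle" resource.
const circle = { cx: 10, cy: 20, r: 5 };

// One renderer per representation, keyed by media type.
const representations = {
  "application/json": (c) =>
    JSON.stringify({ center: [c.cx, c.cy], radius: c.r }),
  "image/svg+xml": (c) =>
    `<svg xmlns="http://www.w3.org/2000/svg"><circle cx="${c.cx}" cy="${c.cy}" r="${c.r}"/></svg>`,
  "text/csv": (c) =>
    [0, Math.PI / 2, Math.PI] // three distinct points on the curve
      .map((t) => [c.cx + c.r * Math.cos(t), c.cy + c.r * Math.sin(t)])
      .map(([x, y]) => `${x.toFixed(2)},${y.toFixed(2)}`)
      .join(","),
};

// Return the requested representation of a resource.
function represent(resource, mediaType) {
  const render = representations[mediaType];
  if (!render) throw new Error("No representation for " + mediaType);
  return render(resource);
}

console.log(represent(circle, "application/json"));
console.log(represent(circle, "image/svg+xml"));
console.log(represent(circle, "text/csv"));
```

The resource itself never changes; only the form in which it is handed over does.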
So what?
Let's take a concrete example from the real world. Here is what the "resource" called Toyota Motor looks like on Google Finance.
http://finance.google.com/finance?q=NYSE%3ATM
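Play around with it and see.
This page gathers information about Toyota Motor in one place. As here, a resource is often made up of various other resources. This is the important part.
A resource is made up of other resources.
Officer data, the current stock price, the price history, related news: each is a resource, and each is in turn made up of other resources.
The resources here include data fetched from a database, but we treat those as resources in just the same way.
In other words, it is resource-oriented.
What is so great about this?
I don't think the definition we saw on Wikipedia is great in itself. What is great is that this resource-oriented way of thinking survived the infancy of the Internet and has risen to prominence.
- Easy to understand. Easy to use.
- Scalable.
It is excellent from both the user's point of view and the technical point of view.
Not RPC, but the exchange of resources.
On the web, let's go RESTful.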
The Challenge in the record-all-era
By MD on Apr 24, 2008 | In Database | Send feedback »
My new column went up today, titled "The Challenge in the record-all-era".
Do you know how fast the data recorded by human beings is growing? Here are two basic questions.
- How much data, by size, had human beings recorded up to the end of the 20th century?
- What about today?
For the 1st question, you can find a clue in an estimate made by Berkeley, here at Wikipedia.
It says:
Earlier Berkeley studies estimated that by the end of 1999, the sum of human-produced information (including all audio, video recordings and text/books) was about 12 exabytes of data.
The article, "So much data, relatively little space", answers the second question.
The previous best estimate came from researchers at the University of California, Berkeley, who totaled the globe's information production at 5 exabytes in 2003.
And the latest one:
Add it all up and IDC determined that the world generated 161 billion gigabytes — 161 exabytes — of digital information last year.
If IDC tracked original data only, its result would have been 40 exabytes.
This article was published in 2007, so the figure is for 2006. Recently, the total has roughly doubled every year!
2005 was perhaps the first year in which more data was created in a single year than in all of history up to the end of the 20th century. It has kept growing since. Today, the yearly figure would be more than 10x that total (40 exabytes in 2006, 80 in 2007, 160 in 2008).
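The yearly figures in parentheses follow from a simple doubling assumption, which can be written as a tiny sketch. The function name `project_exabytes` is hypothetical, and the doubling rate is the post's rough reading of the estimates, not a measured constant.

```c
/* Projects a yearly data volume, assuming it doubles every year.
 * base_eb: volume in exabytes for the starting year.
 * years:   number of years to project forward. */
double project_exabytes(double base_eb, int years)
{
    double eb = base_eb;
    for (int i = 0; i < years; i++) {
        eb *= 2.0;
    }
    return eb;
}
```

Starting from the 40 exabytes of original data for 2006, two years of doubling gives the 160 exabytes used for 2008.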
So far, my column has covered many topics on HDDs, file systems, and the other technologies that back databases. Looking at their challenges and directions, what I found is a fight against fast-growing data.
But the purpose of a database is not storing data, but retrieving it. So we need to ask, "how can I get what I want when I need it?".
Today, the only entry point available is a keyword search. From there, you traverse links. But each site is an island, structured vertically. So many people collect information horizontally across the web by hand.
For example, SONY and Nintendo are apparently related through games. But how could you know that through links?
What we need tomorrow is something like a neural network, which represents logical, dynamic relationships between data and documents. For example, in our brain, when a route from neuron A to neuron B is more active, its cost gets cheaper. Otherwise, it gets higher. It is something like this: not a static structure, but a dynamic one.
That is the challenge in the record-all-era. You'll find useful information through relationships.
Some are already up and running. It's coming soon.
The Day Google Eats Up the World's Electricity
By MD on Apr 22, 2008 | In Database, 日本語 | Send feedback »
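Today, April 22, is Earth Day. So I tried a calculation that sounds impossible but just might not be: if data keeps flooding onto the Internet at the current pace, when will Google eat up all of the world's electricity?
First, let's check the world's total electricity figures.
According to 世界の電気・日本の電気 ("Electricity in the World, Electricity in Japan"), published by the Federation of Electric Power Companies of Japan, world electricity consumption was 15 trillion kWh (kilowatt-hours) in 2003.
And the forecast going forward looks like this.
So how much electricity does Google consume per year? This takes a little calculation, so I will borrow data and calculations from the book Googleを支える技術 ("The Technology Behind Google").
Rumor has it that Google had around 50,000 servers at its IPO in 2004 and around 500,000 as of 2007. Both figures seem to come from Mr. Umeda, and can be inferred from 公開資料からかいま見えるGoogleのコンピュータシステム ("Google's computer systems as glimpsed from public documents") and from his blog entry, Googleの「情報発電所」はいま何台? ("How many machines does Google's 'information power plant' have now?").
Assume one server draws 120 W and runs 24 hours a day, and that cooling and the like add about half as much again. Then the energy one server consumes in a year is: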
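120 W × 24 h × 1.5 × 365 days = 1,576,800 Wh, or about 1,577 kWh
And for 500,000 servers?
1,576.8 kWh × 500,000 = 788,400,000 kWh
About 800 million kWh!!!
With 15 trillion kWh in 2003 and a forecast of 20 trillion kWh for 2010, 2007 was probably around 17 trillion kWh. In that case, the share of world electricity consumption taken by Google's systems in 2007 is:
788,400,000 / 17,000,000,000,000 × 100 = about 0.005%
Only about 0.005%.
Tiny, for now. Next, let's forecast server growth from the growth in data and work out when, with existing technology unchanged, Google would eat up the world's electricity.
According to So much data, relatively little space, the world's data volume was 5 exabytes in 2003 (exa is a billion giga), and adjusting IDC's data to the same measurement method gives 40 exabytes for 2006. Growth is not linear, so a simple comparison is not possible, but servers grew tenfold in three years while data grew eightfold, also in three years. Considering the data that is not included in the tallies, it seems fair to say that Google's server growth and data growth are roughly the same.
According to IDC, whose counting method is slightly different, the figure goes from the 161 exabytes reported in 2007 to 988 exabytes in 2010, so the forecast is a sixfold increase over the next three years. That would put Google at 3 million servers and about 4.7 billion kWh of electricity!!! The outlook for world electricity in 2010 is about 20 trillion kWh. So Google would eat about 0.02% of it.
When will it eat everything up? Forecasting any further is difficult because the basis is thin. On these figures the share is still a tiny fraction of a percent, but consumption is growing far faster than supply, so if the trend held for long enough it would be no surprise to see a power crisis, or something called "Google Power", appear.
*Even this is probably an overestimate. Servers do not draw their maximum power all the time; if the average is around 10%, the share would be smaller by another factor of ten.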
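The estimate can be re-run as a small sketch. The function names are hypothetical; the inputs (120 W per server, a 1.5 overhead factor for cooling, 500,000 servers, 17 trillion kWh of world consumption) are the figures used in this post. Keeping the units explicit (W, Wh, kWh) makes it easier to catch a slip of a factor of 1,000 between them.

```c
/* Yearly energy of one server in kWh.
 * watts:    average draw of the server in watts.
 * overhead: multiplier for cooling and so on (1.5 = half as much again). */
double server_kwh_per_year(double watts, double overhead)
{
    double wh = watts * 24.0 * 365.0 * overhead; /* watt-hours per year */
    return wh / 1000.0;                          /* Wh -> kWh */
}

/* Total yearly energy of a fleet of servers in kWh. */
double fleet_kwh_per_year(double servers, double kwh_per_server)
{
    return servers * kwh_per_server;
}

/* Share of world consumption in percent; both arguments in kWh. */
double share_percent(double fleet_kwh, double world_kwh)
{
    return fleet_kwh / world_kwh * 100.0;
}
```

With the inputs above, `server_kwh_per_year(120.0, 1.5)` gives 1,576.8 kWh and the fleet total comes to 788,400,000 kWh, which `share_percent` then compares against the world figure.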
When you record your whole life
By MD on Apr 7, 2008 | In Brain, Database | Send feedback »
Suppose you could record your whole life. What would happen to you, and to the world?
An exabyte is equal to a million terabytes, or a billion gigabytes. "エクサバイト" is the Japanese word for exabyte, and also the title of a science-fiction novel by Masumi Hattori.
The story is set around 2025. A storage device called "Unit" has appeared on this planet, which can be embedded in your head to record what you see for your whole life.
In the book, an interesting research result is cited: "all data ever recorded by human beings is only 12 exabytes, according to California University".
Where's the source? The Wikipedia entry on "Exabyte" gives some clues with exact references, and it does contain the same estimate.
Earlier Berkeley studies estimated that by the end of 1999, the sum of human-produced information (including all audio, video recordings and text/books) was about 12 exabytes of data.
There you can also find another estimate.
International Data Corporation estimates that approximately 160 exabytes of digital information were created, captured, and replicated worldwide in 2006.
No way! 10x more in only one year!?
OK, so how much storage is required to record one's whole life? Say, 100 years.
Further down, under "Exaflood", it says:
One exabyte is the equivalent of about 50,000 years of DVD quality video.
To record your whole life, 1/500 of an exabyte is enough (100 years out of 50,000, or about 2 petabytes). When you think about the growth of technology, you cannot say that "Unit" is completely unrealistic.
Do you need so much information at all? I don't.
The fact simply scares me. I think we need Maxwell's demon, who can keep entropy from swelling. Otherwise, otherwise...
ChunkIO(Clustered Reads/Writes)
By MD on Mar 12, 2008 | In Performance, db4o | Send feedback »
We're having the db4o global conference in Berlin right now.
Here's what I suggested before and have just decided to work on this year: ChunkIO (Clustered Reads/Writes).
- Who is the target?
Device guys.
- What is the purpose of ChunkIO?
To help users achieve reliable response times on critical operations by providing a way to control the number of IOs. Some internal operations, such as startup, should also improve.
- Why is this required?
Performance is observed in many ways. Usually it is "Maximum" performance that gets mentioned.
"Maximum" performance can be improved by buffering, but "Minimum" performance cannot.
So buffering alone is not considered reliable by device guys.
A way to improve "Minimum" performance is therefore a must.
It is ChunkIO that improves "Minimum" performance.
- How does it work?
"Objects updated in a transaction are written/read all at once"
"The behavior can be configured: NONE, CHUNK, etc"
"The behavior is pluggable"
This should boost performance considerably, but it goes deep into the core, so it must be quite hard to do. Still, I know it is well worth working on.
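The three points above can be sketched roughly as follows. This is not db4o code and not the ChunkIO design itself: every name here (`chunk_tx`, `chunk_sink_fn`, the mode constants) is hypothetical, and the sketch ignores file offsets and staging-buffer overflow. It only shows the idea of staging the objects updated in a transaction and handing them to a pluggable sink in one IO at commit, with a mode switch between per-object and chunked behavior.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical modes, mirroring the "NONE, CHUNK" configuration above. */
enum chunk_mode { CHUNK_MODE_NONE, CHUNK_MODE_CHUNK };

/* Pluggable sink: returns the number of bytes it accepted.
 * A real sink would wrap write(2) or a device-specific call. */
typedef size_t (*chunk_sink_fn)(const void *data, size_t len, void *ctx);

struct chunk_tx {
    enum chunk_mode mode;
    chunk_sink_fn sink;
    void *ctx;
    unsigned char buf[4096]; /* fixed-size staging buffer for the sketch */
    size_t used;
};

void chunk_tx_begin(struct chunk_tx *tx, enum chunk_mode mode,
                    chunk_sink_fn sink, void *ctx)
{
    tx->mode = mode;
    tx->sink = sink;
    tx->ctx = ctx;
    tx->used = 0;
}

/* Stages one updated object. Returns 0 on success, -1 on failure. */
int chunk_tx_put(struct chunk_tx *tx, const void *obj, size_t len)
{
    if (tx->mode == CHUNK_MODE_NONE) {
        /* no chunking: one IO per object */
        return tx->sink(obj, len, tx->ctx) == len ? 0 : -1;
    }
    if (len > sizeof tx->buf - tx->used) {
        return -1; /* sketch only: no spill handling */
    }
    memcpy(tx->buf + tx->used, obj, len);
    tx->used += len;
    return 0;
}

/* Commit: in CHUNK mode everything staged goes out in a single IO. */
int chunk_tx_commit(struct chunk_tx *tx)
{
    if (tx->mode == CHUNK_MODE_CHUNK && tx->used > 0) {
        size_t n = tx->sink(tx->buf, tx->used, tx->ctx);
        int ok = (n == tx->used);
        tx->used = 0;
        return ok ? 0 : -1;
    }
    return 0;
}

/* Example sink that only counts calls, to make the number of IOs visible. */
size_t counting_sink(const void *data, size_t len, void *ctx)
{
    (void)data;
    ++*(int *)ctx;
    return len;
}
```

With `counting_sink`, three `chunk_tx_put` calls followed by a commit cost one IO in CHUNK mode and three in NONE mode, which is the kind of control over the number of IOs the list above asks for.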
This project will be opened under CodeCommander.
I will post here when it is up. I would appreciate any feedback!
Lying Computer 2
By MD on Mar 5, 2008 | In Durability, 日本語 | Send feedback »
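I finished and submitted "Lying Computer 2" in time for tomorrow's publication, and was told that "Lying Computer" had been hugely popular. The provocative title, which I would never have thought of myself, was the editor's idea.
Part 2 solves the mystery. Stay tuned!
(It has now been published here.)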
Improving Database Performance with fdatasync()
By MD on Mar 4, 2008 | In db4o, 日本語 | Send feedback »
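It is no exaggeration to say that a database's write performance (insert, update, delete) depends heavily on commit performance. db4o is no exception. On today's computers, buffering happens at many layers, but commit is the one moment when the hard disk must actually be accessed. (See this column for why hard disk access is such a problem.)
Fundamentally improving database performance is hard work, but there is a way to improve commit performance while leaving the database as it is.
POSIX-compatible OSes provide fdatasync() in addition to fsync(). It skips writing metadata, such as the file's modification time, that is not needed to read the data back. A database normally calls fsync() by default, so simply switching that call to fdatasync() typically yields a performance gain of around 10-20%.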
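As a minimal sketch of the idea in plain C, outside db4o: write a record, then force it to disk with fdatasync() rather than fsync(). The function name `commit_record` and the append-only layout are assumptions made for illustration, and the sketch assumes Linux or another system that provides fdatasync().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <unistd.h>

/* Appends one record to the file at path and forces it to disk.
 * Returns 0 on success, -1 on any error. */
int commit_record(const char *path, const void *rec, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) {
        return -1;
    }

    int rc = 0;
    ssize_t n = write(fd, rec, len);
    if (n < 0 || (size_t)n != len) {
        rc = -1;                     /* failed or short write */
    } else if (fdatasync(fd) < 0) {  /* flush data; may skip e.g. mtime */
        rc = -1;
    }

    if (close(fd) < 0) {
        rc = -1;
    }
    return rc;
}
```

Note that fdatasync() still has to write metadata that is needed to read the data back, such as a changed file size, so the saving is largest when a commit overwrites space that is already allocated.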
This technique also applies to db4o. db4o abstracts I/O operations behind an adapter called IoAdapter, so you can call POSIX fdatasync() from there via JNI.
Below is a simple sample. A tutorial on IoAdapter is available here.
Code:
#include <jni.h>
#include "LinuxIoAdapter.h"
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Class: LinuxIoAdapter
 * Method: openFile
 * Signature: (Ljava/lang/String;ZJ)J
 */
JNIEXPORT jlong JNICALL Java_LinuxIoAdapter_openFile
(JNIEnv *env, jclass jc, jstring path, jboolean lockfile, jlong initiallength)
{
    /* lockfile and initiallength are not used in this simple sample */
    printf("openFile\n");
    const char *filename = (*env)->GetStringUTFChars(env, path, NULL);

    if (filename == NULL) {
        return -1; /* OutOfMemoryError has already been thrown */
    }

    int fd = open(filename, O_RDWR | O_CREAT, 0644);

    if (fd < 0) {
        perror("open failed");
    }

    /* release with the matching UTF call and the pointer we were given */
    (*env)->ReleaseStringUTFChars(env, path, filename);

    return (jlong)fd; /* -1 if open failed */
}

/*
 * Class: LinuxIoAdapter
 * Method: closeFile
 * Signature: (J)V
 */
JNIEXPORT void JNICALL Java_LinuxIoAdapter_closeFile
(JNIEnv *env, jclass jc, jlong handle)
{
    printf("closeFile\n");
    if (close((int)handle) < 0) {
        perror("close failed");
    }
}

/*
 * Class: LinuxIoAdapter
 * Method: getLength
 * Signature: (J)J
 */
JNIEXPORT jlong JNICALL Java_LinuxIoAdapter_getLength
(JNIEnv *env, jclass jc, jlong handle)
{
    struct stat st;
    if (fstat((int)handle, &st) < 0) {
        perror("fstat failed");
        return -1;
    }
    return (jlong)st.st_size;
}

/*
 * Class: LinuxIoAdapter
 * Method: read
 * Signature: (J[BI)I
 */
JNIEXPORT jint JNICALL Java_LinuxIoAdapter_read
(JNIEnv *env, jclass jc, jlong handle, jbyteArray bytes, jint length)
{
    if (length <= 0) {
        return 0;
    }

    /* heap buffer instead of a stack array: length may be large */
    jbyte *buf = malloc((size_t)length);
    if (buf == NULL) {
        return -1;
    }

    ssize_t n = read((int)handle, buf, (size_t)length);

    if (n != length) {
        fprintf(stderr, "read failed: %d->%zd\n", (int)length, n);
    }

    if (n > 0) {
        /* copy back only the bytes that were actually read */
        (*env)->SetByteArrayRegion(env, bytes, 0, (jsize)n, buf);
    }

    free(buf);
    return (jint)n;
}

/*
 * Class: LinuxIoAdapter
 * Method: seek
 * Signature: (JJ)V
 */
JNIEXPORT void JNICALL Java_LinuxIoAdapter_seek
(JNIEnv *env, jclass jc, jlong handle, jlong pos)
{
    if (lseek((int)handle, (off_t)pos, SEEK_SET) < 0) {
        perror("lseek failed");
    }
}

/*
 * Class: LinuxIoAdapter
 * Method: sync
 * Signature: (J)V
 */
JNIEXPORT void JNICALL Java_LinuxIoAdapter_sync
(JNIEnv *env, jclass jc, jlong handle)
{
    if (fdatasync((int)handle) < 0) {
        perror("fdatasync failed");
    }
    /* fsync((int)handle); */
}

/*
 * Class: LinuxIoAdapter
 * Method: write
 * Signature: (J[BI)V
 */
JNIEXPORT void JNICALL Java_LinuxIoAdapter_write
(JNIEnv *env, jclass jc, jlong handle, jbyteArray bytes, jint length)
{
    if (length <= 0) {
        return;
    }

    jbyte *buf = malloc((size_t)length);
    if (buf == NULL) {
        return;
    }

    (*env)->GetByteArrayRegion(env, bytes, 0, length, buf);

    ssize_t n = write((int)handle, buf, (size_t)length);

    if (n != length) {
        fprintf(stderr, "write failed: %d->%zd\n", (int)length, n);
    }

    free(buf);
}
fdatasync() makes db4o 10-20% faster
By MD on Mar 3, 2008 | In db4o | Send feedback »
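I measured how much faster fdatasync() makes db4o on Linux. It showed 10-20% faster commits. In the run below, the average commit cost dropped by about 11% for insert and 14% for update, while delete was roughly unchanged.
The benchmark ran on Linux 2.6 with the disk write cache off, the ext3 file system, and db4o 6.4.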
fdatasync() adapter
---------- 3: insert ----------
Used Memory: 273600/5177344
total commit cost: 33816 ms
average commit cost: 135 ms
Execution Time: 36625ms
Used Memory: 530904/5177344
---------- END ----------
---------- 9: update ----------
Used Memory: 289840/5177344
total commit cost: 38886 ms
average commit cost: 155 ms
Execution Time: 45488ms
Used Memory: 348824/5177344
---------- END ----------
---------- 11: delete ----------
Used Memory: 349088/5177344
total commit cost: 49090 ms
average commit cost: 196 ms
Execution Time: 52867ms
Used Memory: 576280/5177344
---------- END ----------
fsync adapter (the default one on Java)
---------- 3: insert ----------
Used Memory: 354584/5177344
total commit cost: 38190 ms
average commit cost: 152 ms
Execution Time: 41294ms
Used Memory: 659360/5177344
---------- END ----------
---------- 9: update ----------
Used Memory: 372136/5177344
total commit cost: 45015 ms
average commit cost: 180 ms
Execution Time: 50755ms
Used Memory: 428448/5177344
---------- END ----------
---------- 11: delete ----------
Used Memory: 428712/5177344
total commit cost: 49602 ms
average commit cost: 198 ms
Execution Time: 52775ms
Used Memory: 651192/5177344
---------- END ----------
The performance improvement depends on how frequently user data is committed and on the implementation of the fdatasync() call. But even in the worst case, it is as fast as the default one.
BTW, fdatasync() triggered an inconsistent state when combined with fread()/fwrite(). As people suggest, I should have avoided mixing POSIX (unbuffered) I/O with Standard C (buffered) I/O on the same file.
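One common way to avoid that kind of inconsistency, if stdio has to stay in the picture, is to drain the stdio buffer with fflush() before syncing the underlying descriptor. The sketch below shows only that ordering; the function name `durable_fwrite` is hypothetical, and this is not necessarily how the adapter in this post was fixed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <unistd.h>

/* Writes through stdio, then pushes the data all the way down:
 * stdio buffer -> kernel (fflush), kernel -> disk (fdatasync).
 * Returns 0 on success, -1 on any error. */
int durable_fwrite(FILE *fp, const void *data, size_t len)
{
    if (fwrite(data, 1, len, fp) != len) {
        return -1;
    }
    if (fflush(fp) != 0) {
        return -1; /* data may still be sitting in the user-space buffer */
    }
    if (fdatasync(fileno(fp)) < 0) {
        return -1;
    }
    return 0;
}
```

Calling fdatasync() on the descriptor without the fflush() would sync only what stdio had already handed to the kernel, which is one way the buffered and unbuffered views of a file can drift apart.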
Another skeptical guy
By MD on Mar 1, 2008 | In Durability | Send feedback »
Proposal for "proper" durable fsync() and fdatasync()
Jamie made a proposal in the Linux kernel newsgroup.
As I wrote in my latest column, this topic is heating up. If you are serious about durability, let's make some noise together!
Lying Computer
By MD on Feb 28, 2008 | In Information, Durability | Send feedback »
My latest column, titled "Lying Computer", went up today.
It's about "sync", which flushes the buffers in memory, physically writes them to disk, and so synchronizes the state in memory with the one on file. It is not done by a single component, but in a relay. It should work as expected, but today it often doesn't. So, who's telling a lie?
A couple of years ago (I am not sure exactly when), the disk write-back cache was certainly a hot topic.
Linux: Should disk write cache be disabled for any journalised filesystem?
Windows: Manually Enable/Disable Disk Write Caching
In the latter, the Windows manual, you'll see a clear description.
By enabling write caching, file system corruption and/or data loss could occur if the machine experiences a power, device or system failure and cannot be shutdown properly.
So, it is natural to disable the write cache by default, from the database vendor's point of view.
But today, the disk write-back cache is enabled by default. Often, there is not even a way to disable it. This is for the sake of performance, because of the coelacanth of the computer: the HDD.
What makes it worse is the emergence of virtualization layers such as Java and .NET.
sync
public void sync() throws SyncFailedException
Force all system buffers to synchronize with the underlying device. This method returns after all modified data and attributes of this FileDescriptor have been written to the relevant device(s). In particular, if this FileDescriptor refers to a physical storage medium, such as a file in a file system, sync will not return until all in-memory modified copies of buffers associated with this FileDesecriptor have been written to the physical medium. sync is meant to be used by code that requires physical storage (such as a file) to be in a known state For example, a class that provided a simple transaction facility might use sync to ensure that all changes to a file caused by a given transaction were recorded on a storage medium. sync only affects buffers downstream of this FileDescriptor. If any in-memory buffering is being done by the application (for example, by a BufferedOutputStream object), those buffers must be flushed into the FileDescriptor (for example, by invoking OutputStream.flush) before that data will be affected by sync.
Throws:
SyncFailedException - Thrown when the buffers cannot be flushed, or because the system cannot guarantee that all the buffers have been synchronized with physical media.
Since:
JDK1.1
.NET: FileStream.Flush not flushing?
FileStream.Flush Method
Clears all buffers for this stream and causes any buffered data to be written to the file system.
Stream.Flush Method
When overridden in a derived class, clears all buffers for this stream and causes any buffered data to be written to the underlying device.
Microsoft Windows CE .NET 4.2
FlushFileBuffers
This function clears the buffers for the specified file and causes all buffered data to be written to the file.
BOOL WINAPI FlushFileBuffers(
  HANDLE hFile
);
Parameters
hFile
[in] Handle to an open file. The function flushes this file's buffers. The file handle must have GENERIC_WRITE access to the file.
If hFile is a handle to a communications device, the function only flushes the transmit buffer.
Return Values
Nonzero indicates success. Zero indicates failure. To get extended error information, call GetLastError.
Remarks
The WriteFile function typically writes data to an internal buffer that the OS writes to disk on a regular basis. The FlushFileBuffers function writes the buffered information for the specified file to disk.
Requirements
OS Versions: Windows CE 1.0 and later.
Header: Winbase.h.
Link Library: Coredll.lib.
So innocent Java/.NET guys, unless they are aware of the hardware and the underlying architecture, would be misled.
I will check how some major languages and OSes describe their sync behavior, and then ask them to fix it or add a warning if necessary.
The next question is what the best workaround is for you. I will write about it in my next column.
"Interface" in C
By MD on Feb 27, 2008 | In Ease Of Use | Send feedback »
Recently, I began studying the C programming language. I used to be a Java guy, but I want to be able to contribute to serious device projects here in Japan.
Java is a newer, object-oriented language. C is older and not object-oriented.
What surprised me was that I couldn't find a way to cut a dependency. I can define a template, but it is fully exposed, which means "coupling" in object-oriented terms.
Before diving into the C world, and to avoid that kind of "spaghetti", I would like to get a gut feeling for dependency management.
So I investigated the design of the Linux VFS (Virtual File System). File systems in Linux are well abstracted, so the VFS must be a good design in C.
"The VFS is object-oriented. A family of data structures represents the common file model. These data structures are akin to objects. Because the kernel is programmed strictly in C, without the benefit of a language directly supporting object-oriented paradigms, the data structures are represented as C structures. The structures contain both data and pointers to filesystem-implemented functions that operate on the data." - page 212, Linux Kernel Development Second Edition, Robert Love
Ah-ha, a structure could be used like an interface... OK, let's nail down the source. Here's a code snippet from linux/fs.h.
struct super_block {
...
struct super_operations *s_op;
...
};
The methods look like they are defined as an interface. So what about super_operations?
struct super_operations {
struct inode *(*alloc_inode)(struct super_block *sb);
void (*destroy_inode)(struct inode *);
void (*read_inode) (struct inode *);
void (*dirty_inode) (struct inode *);
int (*write_inode) (struct inode *, int);
void (*put_inode) (struct inode *);
void (*drop_inode) (struct inode *);
void (*delete_inode) (struct inode *);
void (*put_super) (struct super_block *);
void (*write_super) (struct super_block *);
int (*sync_fs)(struct super_block *sb, int wait);
void (*write_super_lockfs) (struct super_block *);
void (*unlockfs) (struct super_block *);
int (*statfs) (struct super_block *, struct kstatfs *);
int (*remount_fs) (struct super_block *, int *, char *);
void (*clear_inode) (struct inode *);
void (*umount_begin) (struct super_block *);
...
};
Mmmm, something looks strange. I wonder why each function takes the object it operates on, "struct super_block *" or "struct inode *", as an argument.
Yes, it is because there is no such thing as an "instance" in C. You cannot access the instance inside a method body the way you can with "this" in Java. This is a bit strange, but still OK.
But wait, passing the object itself to the methods means that the interface (the operations struct) depends on its concrete type! That's too bad... The implementation can be abstracted, but it does not cut dependencies the way an interface does.
How can I build an abstraction with an interface that cuts dependencies?
I found a way to do that with a generic pointer. I am trying out the design of Java's "List" interface in C. Performance may be a concern, but it still looks promising. Let's see...
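Here is a minimal sketch of what a generic-pointer interface could look like, assuming a much-reduced "List" with only add, get and size. It is not the code from my experiment, and all names (`list_ops`, `list`, `array_list`, `list_add_all`) are made up for this illustration. The operations struct takes `void *self` instead of a concrete struct pointer, so the interface no longer names its implementation.

```c
#include <stddef.h>

/* The interface: knows nothing about any concrete list type. */
struct list_ops {
    int    (*add)(void *self, void *item);
    void  *(*get)(void *self, size_t index);
    size_t (*size)(void *self);
};

/* A handle pairs the interface with an opaque instance (the "this"). */
struct list {
    const struct list_ops *ops;
    void *self;
};

/* One concrete implementation: a fixed-capacity array list. */
struct array_list {
    void *items[16];
    size_t count;
};

static int array_list_add(void *self, void *item)
{
    struct array_list *al = self;
    if (al->count == sizeof al->items / sizeof al->items[0]) {
        return -1; /* full */
    }
    al->items[al->count++] = item;
    return 0;
}

static void *array_list_get(void *self, size_t index)
{
    struct array_list *al = self;
    return index < al->count ? al->items[index] : NULL;
}

static size_t array_list_size(void *self)
{
    struct array_list *al = self;
    return al->count;
}

static const struct list_ops array_list_ops = {
    array_list_add, array_list_get, array_list_size
};

/* Initializes the storage and wraps it in the generic handle. */
struct list array_list_as_list(struct array_list *al)
{
    struct list l = { &array_list_ops, al };
    al->count = 0;
    return l;
}

/* Client code: depends only on struct list, not on array_list. */
size_t list_add_all(struct list l, void **items, size_t n)
{
    size_t added = 0;
    for (size_t i = 0; i < n; i++) {
        if (l.ops->add(l.self, items[i]) == 0) {
            added++;
        }
    }
    return added;
}
```

The price is type safety: the compiler cannot check that `self` really points at the type the function table expects, and every call goes through a function pointer, which is where the performance question comes from.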