西西河

Topic: Please recommend some review or introduction articles comparing several databases -- ppw

家园 Please recommend some review or introduction articles comparing several databases

Mainly Berkeley DB, PostgreSQL, Firebird, MaxDB, MySQL, and InterBase.

What are their performance and features like?

http://www.geocities.com/mailsoftware42/db/

http://www.linux-mag.com/2002-06/database_01.html

http://www.phpbuilder.com/columns/tim20000705.php3

http://www.linuxplanet.com/linuxplanet/tutorials/1251/9/

below is a must read :)

http://openacs.org/philosophy/why-not-mysql.html

家园 god I love openacs!!

It says everything about Java I want to say:

http://openacs.org/philosophy/openacs-and-java

家园 a discussion

The fact that the basics of database design (e.g. building views) are completely forgotten, along with many other features such as stored procedures with pre-calculated query plans and query optimization, just goes to show that many databases are simply being misused due to a lack of experience. I know many of you guys think you know about databases and queries, but I'm afraid it's a fact that all large corporates have expensive DBA teams, simply because programmers generally do not have the experience to build databases. Remember that these types of databases were processing multi-million-row tables on very archaic machines many, many years ago, when a mini had the power of a 486 desktop.

The complex features of many databases are there to allow total flexibility in getting data together quickly. The art of optimization still continues today, and if you are serious about learning some of it for your personal development as a techy, then learn on a real database, not on MySQL. Download PG, Oracle or Sybase and get started with a good book. I thoroughly recommend some hard and fast learning on database design and build; you'd be amazed at the things you avoid doing because of programming folklore about databases. For example, as Tim mentions, sub-selects save incredible amounts of time and effort in programming and execution time, but only if they are properly optimized; otherwise they can be fatal. I've had numerous programmers try to tell me that sub-selects should not be used at all, probably because a DBA has told them the same thing in the past so his job is easier.
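To make the sub-select point concrete, here is a minimal sketch. It assumes a hypothetical `boards`/`posts` schema (not from the thread) and uses SQLite through Python's `sqlite3` module purely as a convenient stand-in for PG, Oracle or Sybase. It compares one query with a sub-select against the two-queries-plus-application-filtering workaround.

```python
import sqlite3

# Hypothetical schema for illustration only; none of these names come from the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boards (board_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts  (post_id  INTEGER PRIMARY KEY,
                         board_id INTEGER NOT NULL REFERENCES boards(board_id),
                         title    TEXT NOT NULL);
    INSERT INTO boards VALUES (1, 'databases'), (2, 'java'), (3, 'empty');
    INSERT INTO posts  VALUES (1, 1, 'PG vs MySQL'), (2, 1, 'Views'), (3, 2, 'OpenACS');
""")

def boards_with_posts_subselect(conn):
    # One statement: the sub-select is evaluated inside the database.
    rows = conn.execute(
        "SELECT name FROM boards "
        "WHERE board_id IN (SELECT board_id FROM posts) "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]

def boards_with_posts_client_side(conn):
    # The workaround: two queries, with the filtering done in application code.
    ids = {bid for (bid,) in conn.execute("SELECT board_id FROM posts")}
    rows = conn.execute("SELECT board_id, name FROM boards ORDER BY name").fetchall()
    return [name for (bid, name) in rows if bid in ids]
```

Both functions return the same boards. The first leaves the work to the engine, which is only a win if the engine optimizes the sub-select well, as the post cautions.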

I suspect that PG gets bad press because of its hangovers from its origins; sometimes it feels very olde worlde. I was fortunate enough that a company paid for my DBA training on Ingres back in 1990, so using PG was quite easy for me, I admit.

I find it quite amazing that people quote MySQL as being easy to use, but this beauty is only skin deep; there are many annoying things about MySQL ... perhaps it's easy to get going, but it gets progressively worse as you dig deeper. This is of course due to the fact that MySQL is very non-standard, so it becomes more and more proprietary as you learn more about it, which, after spending over 10 years as a DBA, I guess I have low tolerance for. If you are happy to use it and spend time learning to avoid its quirks, then you have to size up the investment you are making ... would you be better off learning on a more standard environment?

So in my biased opinion, PG is the clear winner. That initial MySQL performance is handy (and easy), but there is a big price for it, which is normally paid later on.

One last piece of advice for programmers: learn your craft. Spend as much time learning about your database as you did learning about your favorite programming language, and you will find that databases are actually massively impressive pieces of kit to have in your arsenal. Understand keys and optimization data, understand relationships, and realize that normalization is the right path but not always the best path, which often involves denormalization. Lastly, getting it right can save huge amounts of development, or rather, getting it wrong leads to huge amounts of extra development. Many concepts of OO were built upon good foundations in relational theory. Stored procedures and triggers (events) are early attempts at encapsulation.

I actually find myself agreeing with Gresh. A lot of the PHP developers I've talked to both online and in real life are self-taught, and seem to not bother so much when they are learning about databases.

They learn how to do 3 queries, and they only learn one way to do each of them (generally select, insert, and delete.)

I've met a LOT of PHP developers that do completely insane things with databases, like creating an autonumber field EVERY time they need a primary key. Guys, if you need a primary key, the best solution is to use a piece of data that has an actual literal existence. If you make a table of message boards, and they are all going to be named differently...use the NAME as the primary key.

Learning as much as you can about databases will save you loads of time and confusion when it comes time to write the PHP data access tier of your application.

Do not spend a day on MySQL and decide that you now know all you need to know about database design.

The author of this message on 12/09/01 14:04 is probably long gone; however, I wanted to address his comments. I generally agree with his outlook that developers need to learn relational database development to begin to take advantage of the power of an RDBMS, but I just couldn't resist addressing his example,

"I've met a LOT of PHP developers that do completely insane things with databases, like creating an autonumber field EVERY time they need a primary key. Guys, if you need a primary key, the best solution is to use a piece of data that has an actual literal existence. If you make a table of message boards, and they are all going to be named differently...use the NAME as the primary key."

and tell you it's dead wrong, and he could probably use a course or two in relational database design himself, because that example defies the whole point of relational database design. All you beginners that are using an auto increment as your primary key, keep right on doing it. The whole idea of relational database development is to organize your data so that it is concise and not repeated. You should avoid using actual intelligent data as a primary key, because a primary key in one table is likely a secondary key in one or more tables elsewhere (that's the point of relational databases), and therefore you would need to repeat that data in several places. If you ever needed to update that data, or if the meaning of the identifier changed, you'd have to do it in every table containing it, whereas an alphanumeric ID reference as a primary key always stays the same, so any changes in data are made in only one table. The very statement 'if they are all going to be named differently...' paints you into a corner automatically; you never know what the future will bring. You'll never, ever see a relational database textbook with an example of 'name' as a primary key. It will be something like 'person_id', and then 'name' may be a unique secondary key beside it. In fact you'll see some database experts go the other way and define an 'enterprise key', which adds two unique keys that are simply numeric IDs to each table: one to uniquely identify the row within its table, and one to uniquely identify each row in the entire database, making it much more object oriented.

So in beginning to learn database design, find out what the 3 normal forms are and always use references to rows of data as primary keys (IDs) not data itself.
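The rename argument above can be sketched in a few lines. This is only an illustration under assumed names: the `boards` and `posts` tables and both helper functions are hypothetical, and SQLite via Python's `sqlite3` module stands in for whichever RDBMS you use. The board has a meaningless integer ID as its primary key and its name as a unique secondary key, so a rename touches exactly one row in one table.

```python
import sqlite3

# Hypothetical message-board schema; the names are illustrative, not from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE boards (
        board_id INTEGER PRIMARY KEY,   -- surrogate key, carries no meaning
        name     TEXT NOT NULL UNIQUE   -- natural value, kept unique beside it
    );
    CREATE TABLE posts (
        post_id  INTEGER PRIMARY KEY,
        board_id INTEGER NOT NULL REFERENCES boards(board_id),
        title    TEXT NOT NULL
    );
    INSERT INTO boards (board_id, name) VALUES (1, 'databases');
    INSERT INTO posts (board_id, title) VALUES (1, 'PG vs MySQL'), (1, 'Views');
""")

def rename_board(conn, board_id, new_name):
    # Only the boards table changes; posts keep pointing at the same board_id.
    cur = conn.execute(
        "UPDATE boards SET name = ? WHERE board_id = ?", (new_name, board_id)
    )
    return cur.rowcount  # number of rows the rename had to touch

def titles_for_board(conn, name):
    # Look posts up by the board's current name through the surrogate key.
    rows = conn.execute(
        "SELECT p.title FROM posts p "
        "JOIN boards b ON b.board_id = p.board_id "
        "WHERE b.name = ? ORDER BY p.post_id",
        (name,),
    ).fetchall()
    return [title for (title,) in rows]
```

Had `name` been the primary key, every `posts` row would carry a copy of it, and the rename would have to update those rows as well (or rely on cascading updates). The price of the surrogate key is the extra join in `titles_for_board`.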

To MIKE: while it is clear that you understand the theory of an RDBMS, it is also VERY clear that you have NO IDEA of the mechanics of most RDBMSs out there today. I work with both SQL Server and MySQL, so I can tell you that I know the design and performance metrics of both these systems pretty well, and the fellow you are putting down is more correct than you.

What you say about keeping data normalized is true, and in fact absolutely mandatory. But the first post is also correct in that it is better to use keys from other tables as keys in your tables. The value of the foreign key in one table should be the value of the primary key in another. In a well-designed database, this will facilitate far better performance, in that you will be searching a frequently searched field that has a *clustered* index on it. This clustered index makes all the difference between the two. You always want the fields in your WHERE clause to specify an indexed field, but having the clustered index on a field that can be used to identify the records (or set of records) will remove the step of an additional join to get the value of the ID field of the foreign table that you are talking about. This, plus the clustered index as opposed to the non-clustered index, makes for a significant performance gain. Cheers.
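The narrower claim here, that the fields in a WHERE clause should be indexed, is easy to check against a query planner. The sketch below is a rough illustration only: it uses SQLite through Python's `sqlite3` module, which has no SQL Server style CLUSTERED keyword, so it shows the difference between a full scan and an index search rather than clustered versus non-clustered behaviour. The `posts` table, the `idx_posts_board` index and the `plan` helper are all hypothetical names.

```python
import sqlite3

def plan(conn, sql, params=()):
    # The last column of each EXPLAIN QUERY PLAN row is the human-readable step.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

# Hypothetical table that stores the board name directly in each post row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    " post_id INTEGER PRIMARY KEY,"
    " board_name TEXT NOT NULL,"
    " title TEXT NOT NULL)"
)

query = "SELECT title FROM posts WHERE board_name = ?"

# No index on board_name yet, so the planner has to scan the whole table.
before = plan(conn, query, ("databases",))

# With an index on the WHERE column, the planner can search instead of scan.
conn.execute("CREATE INDEX idx_posts_board ON posts (board_name)")
after = plan(conn, query, ("databases",))

print(before)
print(after)
```

The exact wording of the plan text differs between SQLite versions, so no output is shown here. The first plan should describe a scan of `posts` and the second a search using `idx_posts_board`.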

家园 Looks like this guy has been falling

in love with PG. Good for him.

Never tried PG before, so no comments here! But I just checked the linked source and didn't find PostgreSQL at all.

家园 I cannot search mysql

out either. It looks like most of the data is only for commercial databases, like SQL2000, Oracle, and Informix. The discussion here is mainly about open source databases.

家园 Hopefully those open source databases

can catch up one day. Then we will have far more choices.

家园 hehe

I am going to test PostgreSQL soon. My database needs to serve 250,000 requests every day and holds more than 10,000 photos, so I would like to see how PostgreSQL performs.

But I am the person that guru mentioned: an amateur programmer who knows insert, delete, and update and then thinks he knows databases :p.

A lot to learn, but no time. So I will just go the most inefficient way, select * from where... hehe

That could be another standard for how well a database performs when used by a no-brainer.


Copyright © cchere 西西河