Author: Ivan Prenosil. Original
text
This is just for fun - no prizes are offered (except of publishing
your name in big font if you
send me
correct answers :-)
I received lot of answers, but none correct so far :-(
How would you change character set of table's column
(already populated with data containig national characters),
without copying data to different column/table/database/file ? (i.e. do it "in place")
Details:
Sometimes people create new database with character set NONE, populate it with data,
and then decide to change character set of some columns to something more suitable,
like ISO8859_1.
The problems are
- There is no ALTER command to achieve it directly in SQL.
- When you change character set by modifying system tables, IB/FB will not touch the data;
instead, it will create new format for each modified table
(as with other changes to table structure, like adding new columns etc.).
This is o.k. if the table is empty, but can cause big problems if it is already
populated with data, because
- It is not possible to directly convert string in NONE character set into string
that is e.g. ISO8859_1 (unless it contains only ASCII characters);
otherwise you get the famous "Cannot transliterate character between character sets" error.
Thus, when just changing character set in system tables old data become unreadable!
(the error will occur during read, when IB/FB tries to convert old data to new format).
When update conflict occur in WAIT transaction, it goes into wait state.
What is it that this transaction waits for ?
Does it wait for offending record (to be "unlocked"),
or for the other transaction (to be committed/rollbacked) ?
Is there any difference at all ?
Suppose we have database with no tables, external tables, UDFs, ...
There is only one procedure with one VARCHAR output parameter:
CREATE PROCEDURE proc
RETURNS (str VARCHAR(10)) AS
BEGIN
...
END
When you execute this procedure, it will return string
consisting of 5 ASCII NUL characters (binary zeroes).
What is in the body of SP ?
SOLUTION
Suppose we have database with no tables, external tables, UDFs, ...
There is only one procedure with one VARCHAR output parameter.
When you execute this procedure, it will return string
consisting of 5 ASCII NUL characters (binary zeroes).
What is in the body of this SP ?
When assigning shorter string to longer CHAR field
(or to CHAR variable, or CASTing it to CHAR),
IB/FB pads it with "blanks". The character used for padding is not
always the space; in fact, it depends on character set used.
The character set OCTETS is intended to store binary data,
and so its padding character is - binary zero.
CREATE PROCEDURE proc
RETURNS (str VARCHAR(10)) AS
BEGIN
str = CAST('' AS CHAR(5) CHARACTER SET OCTETS);
END
Suppose you can choose between these two select statements
(they will give you the same result sets):
(A)
SELECT *
FROM tab
WHERE x='YES' OR x='NO' OR x='PERHAPS' OR x='MAYBE';
(B)
SELECT *
FROM tab
WHERE x IN ('YES','NO','PERHAPS','MAYBE');
What would you do if you are concerned about speed ?
- Choose (A) because it is always faster than (B)
- Choose (A) because it is equally fast or faster than (B)
- Choose (B) because it is equally fast or faster than (A)
- Choose (B) because it is always faster than (A)
- Test both because IB's optimizer is unpredictable
- Pick arbitrary one because they are equivalent
- Something else ...
(Some kind of evidence is better than just guessing)
SOLUTION
Which one is faster ?
SELECT *
FROM tab
WHERE x='YES' OR x='NO' OR x='PERHAPS' OR x='MAYBE';
or
SELECT *
FROM tab
WHERE x IN ('YES','NO','PERHAPS','MAYBE');
SELECT commands are not "thrown" to optimizer directly as string,
but are first compiled into BLR (Binary Language Representation).
BLR representations of the above commands are identical, and so the optimizer
has no chance to distinguish them. Thus the answer shoud be
- Pick arbitrary one because they are equivalent
How to look at BLR ? There are many ways, e.g. using QLI tool (use SET BLR command),
looking at output generated by GPRE, or using these SELECT commands in SP, trigger or view
and looking at their compiled versions into system tables
(RDB$PROCEDURE_BLR field in RDB$PROCEDURES table etc.).
Note: To be more precise, IB5 generated identical BLRs. Due to changes in IB6, parts of generated BLR code
can be slightly reordered, which should not however affect the optimizer.
These two clauses do generate identical BLR in IB6/FB:
"WHERE ((x='YES' OR x='NO') OR x='PERHAPS') OR x='MAYBE'"
"WHERE x IN ('YES','NO','PERHAPS','MAYBE')"
Ivan Prenosil (2002)