PostgreSQL修炼之道：从小工到专家 (PostgreSQL Mastery: From Novice to Expert), 2nd Edition, by 唐成 (Tang Cheng)

5.12 The xml Type

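The xml type stores XML data. XML can also be kept in a string type such as text, but text cannot guarantee that what it holds is well-formed XML, so the application has to validate its own input, which makes it harder to develop. The xml type avoids this problem: the database checks every input value and rejects data that is not well-formed XML, and it also provides support functions that operate on xml values in a type-safe way.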
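The difference from text can be seen in a small sketch. It assumes a server built with libxml support and the default xmloption setting of content; the sample strings are made up for illustration.

```sql
-- A text value accepts anything, including malformed XML.
SELECT '<a><b></a>'::text;

-- xml_is_well_formed(text) reports whether a string would be accepted
-- under the current xmloption setting, without raising an error.
SELECT xml_is_well_formed('<a><b></b></a>');  -- true
SELECT xml_is_well_formed('<a><b></a>');      -- false

-- Casting the same malformed string to xml is rejected with an error.
SELECT '<a><b></a>'::xml;
```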

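Note that the xml data type is available only if PostgreSQL was built with libxml support, that is, if the following option was passed to configure when compiling from source:

./configure --with-libxml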

5.12.1 Entering xml Values

As with other types, a value of type xml can be written using either of the following two syntaxes:

xml '<osdba>hello world</osdba>'
'<osdba>hello world</osdba>'::xml

For example:

osdba=# select xml '<osdba>hello world</osdba>';
            xml             
----------------------------
  <osdba>hello world</osdba>
(1 row)

osdba=# select '<osdba>hello world</osdba>'::xml;
            xml             
----------------------------
  <osdba>hello world</osdba>
(1 row)

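A value of type xml can hold one of two kinds of XML data:

·A document, as defined by the XML standard.

·A content fragment, as defined by the XML standard.

A content fragment may have more than one top-level element or character node, whereas a document must have exactly one top-level element. The expression "xmlvalue IS DOCUMENT" tests whether a particular xml value is a document or only a content fragment.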
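A brief sketch of the IS DOCUMENT test. The values are parsed with xmlparse(content ...) so that the result does not depend on the session's xmloption setting; the sample strings are illustrative only.

```sql
-- Exactly one top-level element: this is a document.
SELECT xmlparse(content '<a>a</a>') IS DOCUMENT;          -- true

-- Two top-level elements: valid content, but not a document.
SELECT xmlparse(content '<a>a</a><b>b</b>') IS DOCUMENT;  -- false

-- A bare character node: also content only.
SELECT xmlparse(content 'plain text') IS DOCUMENT;        -- false
```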

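The xmloption parameter determines whether input is parsed as a document or as a content fragment. Its default value is content, so an xml value may have several top-level elements; once the parameter is set to "document", input with more than one top-level element is rejected. For example: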

osdba=# show xmloption;
  xmloption 
-----------
  content
(1 row)

osdba=# select xml '<a>a</a><b>b</b>';
        xml        
------------------
  <a>a</a><b>b</b>
(1 row)

osdba=# SET xmloption TO document;
SET
osdba=# select xml '<a>a</a><b>b</b>';
ERROR:  invalid XML document
LINE 1: select xml '<a>a</a><b>b</b>';
                   ^
DETAIL:  line 1: Extra content at the end of the document
<a>a</a><b>b</b>
        ^

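xml values can also be produced from character strings with the xmlparse function, which is the only way to convert a character string to xml according to the SQL standard.

The syntax of xmlparse is: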

XMLPARSE ( { DOCUMENT | CONTENT } value)

The DOCUMENT and CONTENT keywords specify which kind of XML data the value is parsed as.

For example:

osdba=# SELECT xmlparse (document '<?xml version="1.0"?><person><name>John</name><sex>F</sex></person>');
                    xmlparse                    
------------------------------------------------
  <person><name>John</name><sex>F</sex></person>
(1 row)

osdba=# SELECT xmlparse (content '<person><name>John</name><sex>F</sex></person>');
                    xmlparse                    
------------------------------------------------
  <person><name>John</name><sex>F</sex></person>
(1 row)
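The two keywords behave differently when the input has several top-level elements. The sketch below assumes nothing beyond the xmlparse syntax shown above; the input string is the one used earlier in this section.

```sql
-- CONTENT accepts several top-level elements.
SELECT xmlparse(content '<a>a</a><b>b</b>');

-- DOCUMENT requires exactly one top-level element, so this statement
-- fails with an "invalid XML document" error, whatever xmloption is set to.
SELECT xmlparse(document '<a>a</a><b>b</b>');
```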

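5.12.2 Character Set Issues

PostgreSQL automatically converts character data between encodings when the data passes between client and server and the two sides use different encodings. This feature calls for extra care with XML data. An XML file can declare its own encoding with a declaration such as encoding="XXX", but when the data travels between client and server PostgreSQL converts the content to the destination encoding, so the encoding declaration inside the XML data may no longer be valid. To cope with this, encoding declarations in character strings supplied as input to the xml type are ignored, and the content is assumed to be in the current server encoding.

To have XML handled correctly, the client must send the XML string encoded in the current client encoding; on arrival the server converts it to the server encoding for storage. When an xml value is queried, it is converted back to the client encoding, so the XML data that the client receives is always in the client encoding.

In general, therefore, PostgreSQL handles XML data with the fewest encoding problems, and most efficiently, when the XML data, the client, and the server all use the same encoding. Because XML data is usually processed as UTF-8, setting the database server encoding to UTF-8 is a good choice.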
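A minimal way to check whether the encodings line up, using standard settings. The choice of UTF8 as the client encoding is only an example and should match how the application actually encodes the strings it sends.

```sql
-- Encoding in which the database stores its data.
SHOW server_encoding;

-- Encoding the server assumes for data exchanged with this session.
SHOW client_encoding;

-- Declare that this session sends and expects UTF-8.
SET client_encoding TO 'UTF8';
```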

5.12.3 xml Functions

We first introduce some functions that produce XML content; they are listed in Table 5-29.

Table 5-29 Functions that produce XML content
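The body of the table is not reproduced here. As an illustration only, the following sketch calls a few of PostgreSQL's standard XML-producing functions; whether these are exactly the functions listed in Table 5-29 is an assumption.

```sql
-- Build an element with an attribute.
SELECT xmlelement(name foo, xmlattributes('xyz' AS bar));  -- <foo bar="xyz"/>

-- Build a sequence of elements from named values.
SELECT xmlforest('abc' AS foo, 123 AS bar);  -- <foo>abc</foo><bar>123</bar>

-- Concatenate xml values into one content fragment.
SELECT xmlconcat('<abc/>', '<bar>foo</bar>');  -- <abc/><bar>foo</bar>

-- Produce an XML comment.
SELECT xmlcomment('hello');  -- <!--hello-->
```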

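PostgreSQL provides the xpath function to evaluate XPath 1.0 expressions. XPath is a W3C standard whose main purpose is to locate nodes in the node tree of an XML 1.0 or XML 1.1 document. XPath 1.0 became a W3C Recommendation in 1999 and XPath 2.0 in 2007; PostgreSQL currently supports only XPath 1.0 expressions.

The xpath function is defined as follows: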

xpath(xpath, xml[, nsarray])

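The first argument is an XPath 1.0 expression, the second is a value of type xml, and the optional third is an array of namespace mappings. This array should be two-dimensional, with a second dimension of length 2: each inner array holds exactly two elements, a namespace alias and its URI.

For example: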

SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>', 
              ARRAY[ARRAY['my','http://example.com']]);

  xpath  
--------
  {test}
(1 row)

A default namespace can be handled as follows:

SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>', 
              ARRAY[ARRAY['mydefns', 'http://example.com']]);

  xpath
--------
  {test}
(1 row)
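Because xpath returns an array of xml values, the result usually needs a little unpacking. The sketch below uses a made-up input document; xpath_exists is a related built-in function that only reports whether the expression matches anything.

```sql
-- Every matching node becomes one array element.
SELECT xpath('/a/b/text()', '<a><b>1</b><b>2</b></a>'::xml);             -- {1,2}

-- Take the first match and cast it to text.
SELECT (xpath('/a/b/text()', '<a><b>1</b><b>2</b></a>'::xml))[1]::text;  -- 1

-- Test for a match without retrieving the nodes.
SELECT xpath_exists('/a/c', '<a><b>1</b></a>'::xml);                     -- false
```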

PostgreSQL also provides the following functions for exporting database content as XML:

·table_to_xmlschema(tbl regclass, nulls boolean, tableforest boolean, targetns text)

·query_to_xmlschema(query text, nulls boolean, tableforest boolean, targetns text)

·cursor_to_xmlschema(cursor refcursor, nulls boolean, tableforest boolean, targetns text)

·table_to_xml_and_xmlschema(tbl regclass, nulls boolean, tableforest boolean, targetns text)

·query_to_xml_and_xmlschema(query text, nulls boolean, tableforest boolean, targetns text)

·schema_to_xml(schema name, nulls boolean, tableforest boolean, targetns text)

·schema_to_xmlschema(schema name, nulls boolean, tableforest boolean, targetns text)

·schema_to_xml_and_xmlschema(schema name, nulls boolean, tableforest boolean, targetns text)

·database_to_xml(nulls boolean, tableforest boolean, targetns text)

·database_to_xmlschema(nulls boolean, tableforest boolean, targetns text)

·database_to_xml_and_xmlschema(nulls boolean, tableforest boolean, targetns text)
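The three parameters shared by these functions can be explored with a small self-contained query. The sketch uses query_to_xml, the data-only counterpart of query_to_xmlschema, which is not in the list above; the query text is made up for illustration. In PostgreSQL, nulls controls whether null columns appear in the output (as elements marked xsi:nil="true") or are omitted, tableforest chooses between one wrapping root element and a sequence of per-row elements, and targetns is the XML namespace for the result (an empty string means none).

```sql
-- nulls = true, tableforest = false: a single root element containing
-- one <row> element per row, with the null column marked xsi:nil.
SELECT query_to_xml('SELECT 1 AS id, NULL::text AS note', true, false, '');

-- nulls = false, tableforest = true: a sequence of <row> elements with
-- no wrapping root, and the null column left out entirely.
SELECT query_to_xml('SELECT 1 AS id, NULL::text AS note', false, true, '');
```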

The examples below show how these functions are used.

First create a test table:

CREATE TABLE test01(id int, note text);
INSERT INTO test01 select seq, repeat(seq::text, 2) from generate_series(1,5) as t(seq);

Start with the first function, table_to_xmlschema:

osdba=# select table_to_xmlschema('test01'::regclass, true, true, 'tangspace');
                                table_to_xmlschema                               
--------------------------------------------------------------------------------
  <xsd:schema                                                                  +
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"                             +
      targetNamespace="tangspace"                                              +
      elementFormDefault="qualified">                                           +
                                                                               +
  <xsd:simpleType name="INTEGER">                                              +
    <xsd:restriction base="xsd:int">                                           +
      <xsd:maxInclusive value="2147483647"/>                                   +
      <xsd:minInclusive value="-2147483648"/>                                  +
    </xsd:restriction>                                                         +
  </xsd:simpleType>                                                            +
                                                                               +
  <xsd:simpleType name="UDT.osdba.pg_catalog.text">                            +
    <xsd:restriction base="xsd:string">                                        +
    </xsd:restriction>                                                         +
  </xsd:simpleType>                                                            +
                                                                               +
  <xsd:complexType name="RowType.osdba.public.test01">                         +
    <xsd:sequence>                                                             +
      <xsd:element name="id" type="INTEGER" nillable="true"></xsd:element>     +
      <xsd:element name="note" type="UDT.osdba.pg_catalog.text" nillable="true">.
.</xsd:element>                                                                +
    </xsd:sequence>                                                            +
  </xsd:complexType>                                                           +
                                                                               +
  <xsd:element name="test01" type="RowType.osdba.public.test01"/>              +
                                                                               +
  </xsd:schema>
(1 row)

As the result shows, this function renders the table's definition as an XML Schema document.

Next, the second function, query_to_xmlschema:

osdba=# select query_to_xmlschema('SELECT * FROM test01', true, true, 'tangspace');
                               query_to_xmlschema                               
--------------------------------------------------------------------------------
  <xsd:schema                                                                  +
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"                             +
      targetNamespace="tangspace"                                              +
      elementFormDefault="qualified">                                           +
                                                                               +
  <xsd:simpleType name="INTEGER">                                              +
    <xsd:restriction base="xsd:int">                                           +
      <xsd:maxInclusive value="2147483647"/>                                   +
      <xsd:minInclusive value="-2147483648"/>                                  +
    </xsd:restriction>                                                         +
  </xsd:simpleType>                                                            +
                                                                               +
  <xsd:simpleType name="UDT.osdba.pg_catalog.text">                            +
    <xsd:restriction base="xsd:string">                                        +
    </xsd:restriction>                                                         +
  </xsd:simpleType>                                                            +
                                                                               +
  <xsd:complexType name="RowType">                                             +
    <xsd:sequence>                                                             +
      <xsd:element name="id" type="INTEGER" nillable="true"></xsd:element>     +
      <xsd:element name="note" type="UDT.osdba.pg_catalog.text" nillable="true">.
.</xsd:element>                                                                +
    </xsd:sequence>                                                            +
  </xsd:complexType>                                                           +
                                                                               +
  <xsd:element name="row" type="RowType"/>                                     +
                                                                               +
  </xsd:schema>
(1 row)

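As the result shows, query_to_xmlschema renders the definition of the rows in the query result as XML Schema.

cursor_to_xmlschema does the same for the rows returned by a cursor, so it is not discussed further here.

Now consider table_to_xml_and_xmlschema and query_to_xml_and_xmlschema: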
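For completeness, a minimal sketch of the cursor variant, assuming the test01 table created above. The cursor name c is arbitrary, and cursor_to_xml, which fetches data rather than the schema, is not in the list above and is shown only for comparison.

```sql
BEGIN;

-- A cursor exists only inside a transaction unless declared WITH HOLD.
DECLARE c CURSOR FOR SELECT * FROM test01 ORDER BY id;

-- Describe the cursor's row type as XML Schema.
SELECT cursor_to_xmlschema('c'::refcursor, true, true, 'tangspace');

-- Fetch the next 2 rows from the cursor as XML.
SELECT cursor_to_xml('c'::refcursor, 2, true, true, 'tangspace');

COMMIT;
```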

osdba=# select table_to_xml_and_xmlschema('test01'::regclass, true, true, 'tangspace');
                            table_to_xml_and_xmlschema                           
--------------------------------------------------------------------------------
  <xsd:schema                                                                  +
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"                             +
      targetNamespace="tangspace"                                              +
      elementFormDefault="qualified">                                           +
                                                                               +
  <xsd:simpleType name="INTEGER">                                              +
    <xsd:restriction base="xsd:int">                                           +
      <xsd:maxInclusive value="2147483647"/>                                   +
      <xsd:minInclusive value="-2147483648"/>                                  +
    </xsd:restriction>                                                         +
  </xsd:simpleType>                                                            +
                                                                               +
  <xsd:simpleType name="UDT.osdba.pg_catalog.text">                            +
    <xsd:restriction base="xsd:string">                                        +
    </xsd:restriction>                                                         +
  </xsd:simpleType>                                                            +
                                                                               +
  <xsd:complexType name="RowType.osdba.public.test01">                         +
    <xsd:sequence>                                                             +
      <xsd:element name="id" type="INTEGER" nillable="true"></xsd:element>     +
      <xsd:element name="note" type="UDT.osdba.pg_catalog.text" nillable="true">.
.</xsd:element>                                                                +
    </xsd:sequence>                                                            +
  </xsd:complexType>                                                           +
                                                                               +
  <xsd:element name="test01" type="RowType.osdba.public.test01"/>              +
                                                                               +
  </xsd:schema>                                                                +
                                                                               +
  <test01 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="tangspace.
.">                                                                            +
    <id>1</id>                                                                 +
    <note>11</note>                                                            +
  </test01>                                                                    +
                                                                               +
  <test01 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="tangspace.
.">                                                                            +
    <id>2</id>                                                                 +
    <note>22</note>                                                            +
  </test01>                                                                    +
                                                                               +
  <test01 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="tangspace.
.">                                                                            +
    <id>3</id>                                                                 +
    <note>33</note>                                                            +
  </test01>                                                                    +
                                                                               +
  <test01 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="tangspace.
.">                                                                            +
    <id>4</id>                                                                 +
    <note>44</note>                                                            +
  </test01>                                                                    +
                                                                               +
  <test01 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="tangspace.
.">                                                                            +
    <id>5</id>                                                                 +
    <note>55</note>                                                            +
  </test01>                                                                    +
                                                                               +
 
(1 row)

As the output shows, the only difference between table_to_xml_and_xmlschema and table_to_xmlschema is that the table's data is exported to XML as well. query_to_xml_and_xmlschema relates to query_to_xmlschema in the same way, so it is not covered separately.

Next come schema_to_xml, schema_to_xmlschema, and schema_to_xml_and_xmlschema:

osdba=# select schema_to_xml('public', true, true, 'tangnamespace');
                                 schema_to_xml                                  
--------------------------------------------------------------------------------
  <public xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="tangnames.
.pace">                                                                        +
                                                                               +
  <test01>                                                                     +
    <id>1</id>                                                                 +
    <note>11</note>                                                            +
  </test01>                                                                    +
                                                                               +
  <test01>                                                                     +
    <id>2</id>                                                                 +
    <note>22</note>                                                            +
  </test01>                                                                    +
                                                                               +
  <test01>                                                                     +
    <id>3</id>                                                                 +
    <note>33</note>                                                            +
  </test01>                                                                    +
                                                                               +
  <test01>                                                                     +
    <id>4</id>                                                                 +
    <note>44</note>                                                            +
  </test01>                                                                    +
                                                                               +
  <test01>                                                                     +
    <id>5</id>                                                                 +
    <note>55</note>                                                            +
  </test01>                                                                    +
                                                                               +
                                                                               +
  <test02>                                                                     +
    <id>1</id>                                                                 +
    <note>11</note>                                                            +
  </test02>                                                                    +
                                                                               +
  <test02>                                                                     +
    <id>2</id>                                                                 +
    <note>22</note>                                                            +
  </test02>                                                                    +
                                                                               +
                                                                               +
  </public>                                                                    +
 
(1 row)

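As the output shows, schema_to_xml exports all the data in a schema as XML.

As one would expect, schema_to_xmlschema exports only the schema's definitions, while schema_to_xml_and_xmlschema exports both the definitions and the data.

The remaining three functions, database_to_xml, database_to_xmlschema, and database_to_xml_and_xmlschema, do the same for the entire current database and are not covered further here.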
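The calls follow the same pattern as the earlier examples. The sketch below reuses the argument values from this section; the output is omitted because it depends on what the schema and database contain, and it can be very large for a real database.

```sql
-- Definitions only, for every table in the public schema.
SELECT schema_to_xmlschema('public', true, true, 'tangspace');

-- Definitions only, for the whole current database.
SELECT database_to_xmlschema(true, true, 'tangspace');
```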
