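C# XML File Operations Summary: SAX and DOM Parsing
Last year I wrote a post about two ways of working with XML files in C#, but it included no sample code. This post presents the two approaches again, this time with some sample code.
The two approaches described below are not specific to C#; they apply to other languages as well.
Original link: http://blog.csdn.net/weixingstudio/article/details/7026712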
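There are two main kinds of XML parser: DOM and SAX.
First, a brief introduction to the SAX parser (Simple API for XML). A parser is simply a set of APIs for processing XML, comparable to a module.
A SAX parser is event-driven: it scans the XML document from beginning to end, and each time it encounters a syntactic structure it calls the event handler for that structure, sending an event to the application.
In .NET, the XmlReader class provides a very fast, forward-only, read-only cursor for processing XML data. Because it is a streaming model, its memory requirements are low. However, it offers none of the DOM model's navigation or read-write capabilities. Strictly speaking, XmlReader is a pull-model reader (the application calls Read() to advance) rather than a callback-based SAX parser, but it fills the same streaming role.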
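The forward-only streaming model can be sketched as follows. This is a minimal illustration, not the original post's program: the sample XML string and the names StreamingDemo and ReadTitles are made up for the example. The switch on NodeType plays the role that event handlers play in a SAX parser.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class StreamingDemo
{
    // Hypothetical sample document, used only for illustration.
    public const string SampleXml =
        "<books><book id=\"1\"><title>A</title></book><book id=\"2\"><title>B</title></book></books>";

    // Walks the document once, front to back, and collects the text of every <title>.
    public static List<string> ReadTitles(string xml)
    {
        var titles = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            string currentElement = null;
            while (reader.Read())            // pull the next node; there is no way to go back
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        currentElement = reader.Name;
                        break;
                    case XmlNodeType.Text:
                        if (currentElement == "title")
                            titles.Add(reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        currentElement = null;
                        break;
                }
            }
        }
        return titles;
    }

    public static void Main()
    {
        foreach (string title in ReadTitles(SampleXml))
            Console.WriteLine(title);
    }
}
```

Only the current node is held at any time, which is why memory use stays low regardless of document size.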
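DOM (Document Object Model) parsing builds the document's hierarchical syntactic structure as a DOM tree in memory, with the nodes of the tree represented as objects. Once parsing is complete, the entire DOM tree of the document is held in memory.
Advantages of DOM:
1. It is very convenient when some part of the document is accessed many times.
2. It suits cases where the document needs to be rearranged, or where the whole document is accessed at once.
3. Any part of the document can be accessed at any time.
4. Because the whole document is parsed before any processing starts, it avoids doing work on a document that later turns out to be invalid.
A DOM structure is stored entirely in memory, so a large document consumes a large amount of memory.
Because XML documents have no size limit, some documents cannot practically be read with DOM. SAX parsing does not have this problem, and it is also generally somewhat faster than DOM.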
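The random-access and read-write properties of DOM can be sketched with XmlDocument. This is an illustrative example under assumed inputs: the sample XML string and the names DomDemo and RenameSecondTitle are hypothetical and do not come from the original post.

```csharp
using System;
using System.Xml;

public static class DomDemo
{
    // Hypothetical sample document, used only for illustration.
    public const string SampleXml =
        "<books><book id=\"1\"><title>A</title></book><book id=\"2\"><title>B</title></book></books>";

    // Loads the whole document into memory, then reads and edits it in any order.
    public static string RenameSecondTitle(string xml, string newTitle)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);                        // the entire tree is now in memory

        XmlNodeList books = doc.DocumentElement.ChildNodes;
        XmlNode second = books[1];               // jump straight to the second <book>
        XmlNode first = books[0];                // then back to the first, with no rescan

        second["title"].InnerText = newTitle;    // the tree is writable
        ((XmlElement)first).SetAttribute("checked", "true");

        return doc.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(RenameSecondTitle(SampleXml, "New"));
    }
}
```

The cost of this convenience is that the full tree stays in memory for as long as the XmlDocument is alive.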
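In C#, the classes used for this SAX-style streaming are XmlReader (for reading) and XmlWriter (for writing). Both are abstract classes, and concrete subclasses implement the actual functionality.
Concrete subclasses: XmlTextReader, XmlTextWriter. Instances can also be obtained from the static XmlReader.Create and XmlWriter.Create factory methods.
Next, a quick look at the members of XmlReader: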
Methods:
Close: When overridden in a derived class, changes the ReadState to Closed.
Create(Stream): Creates a new XmlReader instance using the specified stream.
Create(String): Creates a new XmlReader instance with the specified URI.
Create(TextReader): Creates a new XmlReader instance with the specified TextReader.
Create(Stream, XmlReaderSettings): Creates a new XmlReader instance with the specified stream and XmlReaderSettings object.
Create(String, XmlReaderSettings): Creates a new instance with the specified URI and XmlReaderSettings.
Create(TextReader, XmlReaderSettings): Creates a new XmlReader instance using the specified TextReader and XmlReaderSettings objects.
Create(XmlReader, XmlReaderSettings): Creates a new XmlReader instance with the specified XmlReader and XmlReaderSettings objects.
Create(Stream, XmlReaderSettings, String): Creates a new XmlReader instance using the specified stream, base URI, and XmlReaderSettings object.
Create(Stream, XmlReaderSettings, XmlParserContext): Creates a new XmlReader instance using the specified stream, XmlReaderSettings, and XmlParserContext objects.
Create(String, XmlReaderSettings, XmlParserContext): Creates a new XmlReader instance using the specified URI, XmlReaderSettings, and XmlParserContext objects.
Create(TextReader, XmlReaderSettings, String): Creates a new XmlReader instance using the specified TextReader, XmlReaderSettings, and base URI.
Create(TextReader, XmlReaderSettings, XmlParserContext): Creates a new XmlReader instance using the specified TextReader, XmlReaderSettings, and XmlParserContext objects.
Dispose: Releases the unmanaged resources used by the XmlReader and optionally releases the managed resources.
Equals(Object): Determines whether the specified Object is equal to the current Object. (Inherited from Object.)
Finalize: Allows an Object to attempt to free resources and perform other cleanup operations before the Object is reclaimed by garbage collection. (Inherited from Object.)
GetAttribute(Int32): When overridden in a derived class, gets the value of the attribute with the specified index.
GetAttribute(String): When overridden in a derived class, gets the value of the attribute with the specified Name.
GetAttribute(String, String): When overridden in a derived class, gets the value of the attribute with the specified LocalName and NamespaceURI.
GetHashCode: Serves as a hash function for a particular type. (Inherited from Object.)
GetType: Gets the Type of the current instance. (Inherited from Object.)
IsName: Gets a value indicating whether the string argument is a valid XML name.
IsNameToken: Gets a value indicating whether or not the string argument is a valid XML name token.
IsStartElement(): Calls MoveToContent and tests if the current content node is a start tag or empty element tag.
IsStartElement(String): Calls MoveToContent and tests if the current content node is a start tag or empty element tag and if the Name property of the element found matches the given argument.
IsStartElement(String, String): Calls MoveToContent and tests if the current content node is a start tag or empty element tag and if the LocalName and NamespaceURI properties of the element found match the given strings.
LookupNamespace: When overridden in a derived class, resolves a namespace prefix in the current element's scope.
MemberwiseClone: Creates a shallow copy of the current Object. (Inherited from Object.)
MoveToAttribute(Int32): When overridden in a derived class, moves to the attribute with the specified index.
MoveToAttribute(String): When overridden in a derived class, moves to the attribute with the specified Name.
MoveToAttribute(String, String): When overridden in a derived class, moves to the attribute with the specified LocalName and NamespaceURI.
MoveToContent: Checks whether the current node is a content (non-white space text, CDATA, Element, EndElement, EntityReference, or EndEntity) node. If the node is not a content node, the reader skips ahead to the next content node or end of file.
MoveToElement: When overridden in a derived class, moves to the element that contains the current attribute node.
MoveToFirstAttribute: When overridden in a derived class, moves to the first attribute.
MoveToNextAttribute: When overridden in a derived class, moves to the next attribute.
Read: When overridden in a derived class, reads the next node from the stream.
ReadAttributeValue: When overridden in a derived class, parses the attribute value into one or more Text, EntityReference, or EndEntity nodes.
ReadContentAs: Reads the content as an object of the type specified.
ReadContentAsBase64: Reads the content and returns the Base64 decoded binary bytes.
ReadContentAsBinHex: Reads the content and returns the BinHex decoded binary bytes.
ReadContentAsBoolean: Reads the text content at the current position as a Boolean.
ReadContentAsDateTime: Reads the text content at the current position as a DateTime object.
ReadContentAsDecimal: Reads the text content at the current position as a Decimal object.
ReadContentAsDouble: Reads the text content at the current position as a double-precision floating-point number.
ReadContentAsFloat: Reads the text content at the current position as a single-precision floating-point number.
ReadContentAsInt: Reads the text content at the current position as a 32-bit signed integer.
ReadContentAsLong: Reads the text content at the current position as a 64-bit signed integer.
ReadContentAsObject: Reads the text content at the current position as an Object.
ReadContentAsString: Reads the text content at the current position as a String object.
ReadElementContentAs(Type, IXmlNamespaceResolver): Reads the element content as the requested type.
ReadElementContentAs(Type, IXmlNamespaceResolver, String, String): Checks that the specified local name and namespace URI matches that of the current element, then reads the element content as the requested type.
ReadElementContentAsBase64: Reads the element and decodes the Base64 content.
ReadElementContentAsBinHex: Reads the element and decodes the BinHex content.
ReadElementContentAsBoolean(): Reads the current element and returns the contents as a Boolean object.
ReadElementContentAsBoolean(String, String): Checks that the specified local name and namespace URI matches that of the current element, then reads the current element and returns the contents as a Boolean object.
ReadElementContentAsDateTime(): Reads the current element and returns the contents as a DateTime object.
ReadElementContentAsDateTime(String, String): Checks that the specified local name and namespace URI matches that of the current element, then reads the current element and returns the contents as a DateTime object.
ReadElementContentAsDecimal(): Reads the current element and returns the contents as a Decimal object.
ReadElementContentAsDecimal(String, String): Checks that the specified local name and namespace URI matches that of the current element, then reads the current element and returns the contents as a Decimal object.
ReadElementContentAsDouble(): Reads the current element and returns the contents as a double-precision floating-point number.
ReadElementContentAsDouble(String, String): Checks that the specified local name and namespace URI matches that of the current element, then reads the current element and returns the contents as a double-precision floating-point number.
ReadElementContentAsFloat(): Reads the current element and returns the contents as a single-precision floating-point number.
ReadElementContentAsFloat(String, String): Checks that the specified local name and namespace URI matches that of the current element, then reads the current element and returns the contents as a single-precision floating-point number.
ReadElementContentAsInt(): Reads the current element and returns the contents as a 32-bit signed integer.
ReadElementContentAsInt(String, String): Checks that the specified local name and namespace URI matches that of the current element, then reads the current element and returns the contents as a 32-bit signed integer.
ReadElementContentAsLong(): Reads the current element and returns the contents as a 64-bit signed integer.
ReadElementContentAsLong(String, String): Checks that the specified local name and namespace URI matches that of the current element, then reads the current element and returns the contents as a 64-bit signed integer.
ReadElementContentAsObject(): Reads the current element and returns the contents as an Object.
ReadElementContentAsObject(String, String): Checks that the specified local name and namespace URI matches that of the current element, then reads the current element and returns the contents as an Object.
ReadElementContentAsString(): Reads the current element and returns the contents as a String object.
ReadElementContentAsString(String, String): Checks that the specified local name and namespace URI matches that of the current element, then reads the current element and returns the contents as a String object.
ReadElementString(): Reads a text-only element.
ReadElementString(String): Checks that the Name property of the element found matches the given string before reading a text-only element.
ReadElementString(String, String): Checks that the LocalName and NamespaceURI properties of the element found match the given strings before reading a text-only element.
ReadEndElement: Checks that the current content node is an end tag and advances the reader to the next node.
ReadInnerXml: When overridden in a derived class, reads all the content, including markup, as a string.
ReadOuterXml: When overridden in a derived class, reads the content, including markup, representing this node and all its children.
ReadStartElement(): Checks that the current node is an element and advances the reader to the next node.
ReadStartElement(String): Checks that the current content node is an element with the given Name and advances the reader to the next node.
ReadStartElement(String, String): Checks that the current content node is an element with the given LocalName and NamespaceURI and advances the reader to the next node.
ReadString: When overridden in a derived class, reads the contents of an element or text node as a string.
ReadSubtree: Returns a new XmlReader instance that can be used to read the current node and all its descendants.
ReadToDescendant(String): Advances the XmlReader to the next descendant element with the specified qualified name.
ReadToDescendant(String, String): Advances the XmlReader to the next descendant element with the specified local name and namespace URI.
ReadToFollowing(String): Reads until an element with the specified qualified name is found.
ReadToFollowing(String, String): Reads until an element with the specified local name and namespace URI is found.
ReadToNextSibling(String): Advances the XmlReader to the next sibling element with the specified qualified name.
ReadToNextSibling(String, String): Advances the XmlReader to the next sibling element with the specified local name and namespace URI.
ReadValueChunk: Reads large streams of text embedded in an XML document.
ResolveEntity: When overridden in a derived class, resolves the entity reference for EntityReference nodes.
Skip: Skips the children of the current node.
ToString: Returns a String that represents the current Object. (Inherited from Object.)

Properties:
AttributeCount: When overridden in a derived class, gets the number of attributes on the current node.
BaseURI: When overridden in a derived class, gets the base URI of the current node.
CanReadBinaryContent: Gets a value indicating whether the XmlReader implements the binary content read methods.
CanReadValueChunk: Gets a value indicating whether the XmlReader implements the ReadValueChunk method.
CanResolveEntity: Gets a value indicating whether this reader can parse and resolve entities.
Depth: When overridden in a derived class, gets the depth of the current node in the XML document.
EOF: When overridden in a derived class, gets a value indicating whether the reader is positioned at the end of the stream.
HasAttributes: Gets a value indicating whether the current node has any attributes.
HasValue: When overridden in a derived class, gets a value indicating whether the current node can have a Value.
IsDefault: When overridden in a derived class, gets a value indicating whether the current node is an attribute that was generated from the default value defined in the DTD or schema.
IsEmptyElement: When overridden in a derived class, gets a value indicating whether the current node is an empty element (for example, <MyElement/>).
Item[Int32]: When overridden in a derived class, gets the value of the attribute with the specified index.
Item[String]: When overridden in a derived class, gets the value of the attribute with the specified Name.
Item[String, String]: When overridden in a derived class, gets the value of the attribute with the specified LocalName and NamespaceURI.
LocalName: When overridden in a derived class, gets the local name of the current node.
Name: When overridden in a derived class, gets the qualified name of the current node.
NamespaceURI: When overridden in a derived class, gets the namespace URI (as defined in the W3C Namespace specification) of the node on which the reader is positioned.
NameTable: When overridden in a derived class, gets the XmlNameTable associated with this implementation.
NodeType: When overridden in a derived class, gets the type of the current node.
Prefix: When overridden in a derived class, gets the namespace prefix associated with the current node.
QuoteChar: When overridden in a derived class, gets the quotation mark character used to enclose the value of an attribute node.
ReadState: When overridden in a derived class, gets the state of the reader.
SchemaInfo: Gets the schema information that has been assigned to the current node as a result of schema validation.
Settings: Gets the XmlReaderSettings object used to create this XmlReader instance.
Value: When overridden in a derived class, gets the text value of the current node.
ValueType: Gets the Common Language Runtime (CLR) type for the current node.
XmlLang: When overridden in a derived class, gets the current xml:lang scope.
XmlSpace: When overridden in a derived class, gets the current xml:space scope.

Looking at the Read() calls made in the constructor, the list on the left shows what happens: the reader first reads the <?xml> declaration and reports two attributes on that node. It then reads the line break at the end of the first line, which has 0 attributes. Next comes the comment, with 0 attributes, followed by the line break after the comment, again with 0 attributes. The rest is easy to follow.
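The node-by-node walk described above can be reproduced with a short console sketch. The sample document and the names NodeWalkDemo and DescribeNodes are assumptions made for this example; they stand in for the original program and its input file. With the reader's default settings, whitespace between nodes is reported as its own node, which is where the line-break entries come from.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeWalkDemo
{
    // Hypothetical sample: declaration, comment, and an empty root element on separate lines.
    public const string SampleXml =
        "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<!-- sample -->\n<books />";

    // Returns one "NodeType: AttributeCount" entry per node visited by Read().
    public static List<string> DescribeNodes(string xml)
    {
        var lines = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
                lines.Add(reader.NodeType + ": " + reader.AttributeCount);
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in DescribeNodes(SampleXml))
            Console.WriteLine(line);
    }
}
```

Setting IgnoreWhitespace or IgnoreComments on an XmlReaderSettings object passed to XmlReader.Create would remove the corresponding entries from the walk.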
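The program above also shows how to write XML data with XmlWriter, which provides a forward-only, non-cached way of writing. Take care that everything you write is properly paired: every start element you write must be closed by an end element at the appropriate place.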
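A minimal sketch of paired start and end elements with XmlWriter follows. The element names, the WriterDemo class, and the WriteBook method are assumptions chosen for illustration rather than the original program. The XML declaration is omitted here only to keep the output short.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WriterDemo
{
    // Writes a tiny <books> document to a string, closing every element it opens.
    public static string WriteBook(string title, string publicationDate)
    {
        var output = new StringWriter();
        var settings = new XmlWriterSettings { Indent = true, OmitXmlDeclaration = true };

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("books");                 // opens <books>
            writer.WriteStartElement("book");                  // opens <book>
            writer.WriteAttributeString("publicationdate", publicationDate);
            writer.WriteElementString("title", title);         // writes <title>...</title> in one call
            writer.WriteEndElement();                          // closes <book>
            writer.WriteEndElement();                          // closes <books>
        }

        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WriteBook("Sample Title", "1981"));
    }
}
```

Because the writer is forward-only, an element that has been closed cannot be reopened, so the order of the WriteStartElement and WriteEndElement calls determines the nesting.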
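// Requires: using System; using System.Windows.Forms; using System.Xml;
private void button5_Click(object sender, EventArgs e)
{
    XmlDocument doc = new XmlDocument();

    // Skip comments while loading
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.IgnoreComments = true;

    // Pass the settings to Create; otherwise IgnoreComments has no effect
    using (XmlReader reader = XmlReader.Create("book.xml", settings))
    {
        doc.Load(reader);
    }

    // Get the root element
    XmlNode root = doc.DocumentElement;
    XmlNode tofind = root.SelectSingleNode("//book[@publicationdate='" + textBox1.Text.Trim() + "']");

    // SelectSingleNode returns null when nothing matches
    MessageBox.Show(tofind != null ? tofind.Name : "No matching book element found.");
}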