Method and system of using fixed-length addresses in message routing
Field of the invention
The present invention generally relates to a service access point system composed of at least two servers, which are connected to each other by an intranet, and which routes messages sent by client applications.
Background of the invention
Various client-server services are based on the use of access points, wherein clients are logged into a service and communicate with each other via a set of access points. Each client is an application program, and a set of access points composes a logical server. Instant messaging services, such as a chat service and an e-mail service, are some examples of client-server services. A set of access points is connected to an access network that may be a fixed network or a wireless network. The access network is preferably a packet-switched network, though it could also be a circuit-switched network. The term "access point" is mostly used in the text, because it is shorter than the term "service access point".
The Internet is a packet-switched network whose nodes have an Internet protocol address (IP address). Each IP address consists of four numbers between 0 and 255, separated by dots; for example, 193.199.35.5. The first number refers to the topmost network level, the second number refers to the next level, etc. The routers of the Internet locate the correct receiver by its IP address. Since the Internet is a packet-switched network, no circuit is allocated for the connection. Instead, data is transmitted in packets from the sender to the receiver.
Maximum transfer unit (MTU) defines how many bytes an IP packet can at most contain. The said MTU is usually at least 1500 bytes in Internet protocol version 4 (IPv4) and in Internet protocol version 6 (IPv6). Nevertheless, according to the specifications of IPv4, the minimum of the MTU is 576 bytes in IPv4 packets, and according to the specifications of IPv6, the minimum of the MTU is 1280 bytes in IPv6 packets. Thus, the said specifications guarantee that the size of an IP packet can be at least 576 bytes in IPv4 transmissions and 1280 bytes in IPv6 transmissions. The small size of a packet ensures that transmission capacity is divided between end-users of a network. Each IP packet includes a header
with the following information: a sender IP address, a receiver IP address, a sender port (application), and a receiver port (application), among other things. Transmission control protocol/Internet protocol (TCP/IP) is a set of protocols that determine how IP packets are transmitted in the Internet. When using TCP/IP the bytes (octets) have a sequence number.
Thus, a receiver node detects if one or more bytes are missing. Then it sends a retransmission request for the missing bytes. In another protocol a receiver may send an acknowledgement as a response to receiving bytes. User datagram protocol (UDP) is in some cases an alternative to TCP/IP. UDP offers a transmission with a minimum of protocol handling. Therefore retransmission requests or packet acknowledgements are not used in UDP. For the same reason a sender cannot know whether a receiver has received the packets sent.
The world wide web (WWW or web) is an Internet-based, distributed hypermedia information system. The web pages are traditionally represented using hypertext markup language (HTML). HTML and extensible markup language (XML) are intended for forming structured documents to be interchanged in the web. Structured documents are searched and read through software that is termed a browser. Hypertext transfer protocol (HTTP) determines how structured documents are transferred in the Internet.
As the Internet has become very popular, it has also been brought to mobile and wireless devices. Many of the prior art services presently in use are based on the global system for mobile communications (GSM) standard. General packet radio services (GPRS) and the universal mobile telecommunications system (UMTS) are third-generation mobile communications which will replace second-generation mobile communications, such as GSM. Wireless markup language (WML) is a formal language that allows the text portions of structured documents to be presented via wireless networks on wireless devices. WML is a part of the wireless application protocol (WAP). WAP is similar to TCP/IP based protocols enabling Internet in wireless devices.
In addition to TCP/IP and UDP packets, short messages are another method to transmit data from a client to an access point. GSM limits the length of short messages to 160 characters. Multimedia messaging service (MMS) is able to deliver larger messages in a reasonable time compared to
short message service (SMS). The fixed limit will be replaced by an ability to transfer much larger text contents, graphics, and audio/video clips.
Thus, the access network may be a fixed network or a wireless network, such as a GSM, GPRS, or UMTS network, or a wireless local area network (WLAN). Clients send data via the access network to a set of access points. Data may be located, for example, in a TCP/IP packet, a UDP packet, or in a short message.
When a client is logged into a service and sends data to an access point, the data contains at least one reference address. The said reference address identifies a sender, a receiver, or a service. The reference address may be a fully qualified domain name (FQDN), or it may be an MSISDN number, i.e. a mobile subscriber integrated service digital network number.
Thus, data sent by a client may include one or more reference addresses which are, for example, MSISDN numbers or fully qualified domain names. An e-mail address, such as Alfa@wiral.com, is one example of an FQDN, but there are also other types of domain names.
The Internet consists of thousands of domains. Each domain has a domain name which is mapped to a certain IP-address. Several domain names may be mapped to the same IP-address. For example, domain names www.jypoly.fi and jkolamk.jkol.jypoly.fi are mapped to IP-address 193.199.35.1. In contrast, fully qualified domain names, such as e-mail user names, are unique.
Domain names compose a hierarchical domain name system (DNS). The root of a DNS tree is nameless. Top-level domains are under the root: the original three-letter domains are .com, .net, .org, .edu, .int, .mil and .gov, plus two-letter top-level domains for each country. Under the top-level domains there are lower domains connected to the Internet. The Internet includes domain name servers mapping domain names to IP addresses.
Uniform resource locator (URL) is a system uniquely identifying each resource in the Internet, i.e. where each document or file is located.
A URL address consists of a domain name and a search path. For example, the URL address "www.jypoly.fi/internet/jamk.nsf" consists of domain name "www.jypoly.fi" and search path "/internet/jamk.nsf".
A URL request consists of a protocol part and a URL address. For example, in the following URL request the protocol part is "http://" and the URL address is the before-mentioned "www.jypoly.fi/internet/jamk.nsf":
http://www.jypoly.fi/internet/jamk.nsf
URL is related to several protocols defining how data transfer is performed. Hypertext transfer protocol (HTTP), file transfer protocol (FTP), a protocol for e-mail, i.e. Mailto, and WAP are some of these protocols.
FIG. 1 shows a client-server system composed of one access point, wherein client Alfa 11 and client Beta 12 are logged into a service. Client Alfa and Beta applications have their own reference addresses related to the service. During the use of the service client Alfa sends data via an access network 13 to client Beta's reference address. An access point 14 receives the packets and transmits them to client Beta. In FIG. 1 client Alfa is located in a mobile phone and client Beta is located in a laptop. In addition to these devices, a client could be located in, for example, a personal digital assistant (PDA), a personal computer, or a network server.
For example, the Jabber IM server described in http://www.jabber.org can be used as an access point for instant messaging (IM) services. However, when a system must have high capacity, a set of access points is needed to handle data sent by clients. The access points can be connected to each other by means of an intranet.
As mentioned above, the IPv4 and IPv6 specifications guarantee that the size of an IP packet can be at least 576 bytes in IPv4 transmissions and 1280 bytes in IPv6 transmissions. These minimum MTUs should be taken into account in the intranet transmissions. However, a reference address that includes a URL may be very long. If an IP packet includes a 500-byte reference address, the payload of the IP packet may be less than 76 bytes in IPv4 and less than 780 bytes in IPv6. If two reference addresses are included in an IP packet, the payload is even smaller. In any case, variable-length reference addresses make the handling of any type of packet inefficient, including the application level routing of the packets.
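The arithmetic above can be illustrated with a short sketch. It is not part of the described system: the header sizes (20 bytes for an IPv4 header without options, 40 bytes for the fixed IPv6 header) and the name room_for_payload are assumptions introduced here, and transport-layer headers are ignored.

```python
# Minimum MTUs as stated in the text.
MIN_MTU = {"IPv4": 576, "IPv6": 1280}
# Assumed IP header sizes: IPv4 without options, fixed IPv6 header.
IP_HEADER = {"IPv4": 20, "IPv6": 40}


def room_for_payload(version: str, address_bytes: int, addresses: int = 1) -> int:
    """Bytes left in a minimum-MTU packet after the IP header and the addresses."""
    return MIN_MTU[version] - IP_HEADER[version] - addresses * address_bytes


for version in MIN_MTU:
    with_reference = room_for_payload(version, 500)  # 500-byte reference address
    with_fixed = room_for_payload(version, 16)       # 16-byte fixed-length address
    print(version, with_reference, with_fixed)
```

Under these assumptions a 500-byte reference address leaves 56 bytes (IPv4) or 740 bytes (IPv6) for the payload, whereas a 16-byte address leaves 540 or 1224 bytes.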
The drawback of the prior art systems is that an intranet connecting servers may become the bottleneck of a system, thus limiting the capacity of the system. Even a relatively high-capacity network may be blocked during high load. One reason for blocking is that packets to be transmitted in an intranet may include reference addresses as long as 500 bytes. The drawback concerns, among other things, instant messaging services.
Summary of the invention
The invention relates to a system comprising at least two servers that are connected/coupled to each other by an intranet. These two servers may both be access points, or one of them may be an access point and the other one a gateway. An intranet may be, e.g., a 100 Mbps Ethernet network.
The objective of the invention is to increase the communication capacity between a set of servers being coupled, directly or indirectly, to an intranet, wherein at least one of the said servers is an access point. This objective is achieved by increasing the payload of packets to be transmitted in the intranet. The packets are preferably IP-packets whose size equals the maximum transfer unit (MTU). Data sent by a client may be located in bigger packets/messages in an access network. In that case, the packets/messages are fragmented into IP-packets of the size of the MTU so that the IP-packets are transmitted efficiently in the intranet, especially when the intranet is an Ethernet network. The maximum packet size of Ethernet is 1512 bytes.
Data sent by a client contains at least one reference address. The reference address could be, e.g., the receiver's e-mail address, such as beta@wiral.com. To increase the payload, the reference address is input as a parameter to a hash function that generates a 16-byte fixed-length address. The fixed-length address replaces the reference address in the data part of the IP-packet that is transmitted in the intranet.
Thus, the payload of an IP-packet can be remarkably increased by replacing a reference address with a relatively short fixed-length address. In addition, a fixed-length address is more efficient to handle in a receiving access point than a variable-length reference address. For example, the hash function named Digest generates 16-byte fixed-length addresses, which are statistically unique enough, from reference addresses that are 1-500 bytes long.
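A minimal sketch of the address generation follows, assuming that the Digest function is the MD5 algorithm named in the detailed description and that the reference address is encoded as UTF-8 before hashing. The function name fixed_length_address is hypothetical.

```python
import hashlib


def fixed_length_address(reference_address: str) -> bytes:
    """Map a variable-length reference address to a 16-byte address (MD5 digest)."""
    # Assumption: the reference address is hashed as UTF-8 bytes, unmodified.
    return hashlib.md5(reference_address.encode("utf-8")).digest()


short = fixed_length_address("beta@wiral.com")
long_ = fixed_length_address("www.jypoly.fi/internet/jamk.nsf" + "x" * 400)
print(len(short), len(long_))  # both addresses are 16 bytes long
```

Whatever the length of the input, the result always occupies 16 bytes, which is what allows the data part of the packet to have a fixed layout.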
When a client logs into a system, an access point generates a fixed-length address for the client and stores the generated fixed-length address in a memory. When the client sends data to another client, the access point generates a fixed-length address using a reference address of the data as a parameter. Then the access point places the fixed-length address and a payload in an IP-packet, and sends the IP-packet to an intranet. Another access point, or a gateway belonging to the system, receives the IP-packet, obtains the fixed-length address from it, and searches the memory for the address. If the address is found, the data is transmitted to the client to which the fixed-length address was related when that client logged into the system.
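The login and lookup steps described above might be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the class AccessPoint, its method names, the use of a dictionary as the memory, and the packet layout (16-byte address followed directly by the payload, without any header) are all hypothetical, and MD5 is assumed as the hash function.

```python
import hashlib
from typing import Dict, Optional, Tuple


def fixed_length_address(reference_address: str) -> bytes:
    """16-byte MD5 digest of the reference address (assumed hash function)."""
    return hashlib.md5(reference_address.encode("utf-8")).digest()


class AccessPoint:
    """Hypothetical access point keeping logged-in clients in a hash table."""

    def __init__(self) -> None:
        # Memory: fixed-length address -> connection data of a logged-in client.
        self._clients: Dict[bytes, str] = {}

    def login(self, reference_address: str, connection: str) -> None:
        """Generate and store the fixed-length address when a client logs in."""
        self._clients[fixed_length_address(reference_address)] = connection

    def outgoing(self, receiver_reference: str, payload: bytes) -> bytes:
        """Replace the reference address with its fixed-length address."""
        return fixed_length_address(receiver_reference) + payload

    def incoming(self, packet: bytes) -> Optional[Tuple[str, bytes]]:
        """Return (connection, payload) if the address is known, else None."""
        address, payload = packet[:16], packet[16:]
        connection = self._clients.get(address)
        return None if connection is None else (connection, payload)


first, second = AccessPoint(), AccessPoint()
second.login("beta@wiral.com", "connection-to-beta")
packet = first.outgoing("beta@wiral.com", b"hello")
print(second.incoming(packet))  # delivered to client Beta's connection
print(first.incoming(packet))   # unknown address at the first access point
```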
Another objective of the invention is to implement a system which can be termed an application level message router and which is composed of commercial off-the-shelf (COTS) hardware. The application level message router can be utilized, for example, in instant messaging (IM) services, wherein messages are routed inside the system and/or between systems.
Brief description of the drawings
The invention is described more closely with reference to the accompanying drawings, in which
Figure 1 shows an access point communicating with clients,
Figure 2 shows a system composed of two access points,
Figure 3 shows a system composed of an access point and a gateway,
Figure 4 depicts a hash function converting a reference address to a fixed-length address,
Figure 5A shows a packet to be transmitted in an intranet,
Figure 5B shows a packet to be transmitted in an Ethernet network, and
Figure 6 shows a system containing three access points, a load balancer, and a gateway.
Detailed description of the invention
An inventive access point system comprises at least two access points connected/coupled to each other by an intranet. Alternatively, the access point system comprises at least one access point and a gateway connected/coupled to each other by an intranet.
FIG. 2 shows an access point system composed of two access points connected to each other by means of an intranet. Client Alfa 21 and client Beta 22 are logged into a service that is provided by the domain of Wiral Ltd. Client Alfa has an e-mail address alfa@wiral.com and client Beta has an e-mail address beta@wiral.com. During the use of the service client Alfa sends data including a reference address via an access network 23 to client Beta's e-mail address. We may suppose that client Alfa is logged into a first access point 24 and client Beta is logged into a second access point 25. The first access point receives data sent by client Alfa, wherein the data
includes a reference address and a payload. Then the first access point generates a fixed-length address by applying a hash function to the reference address, places the fixed-length address, i.e. the result of the hash function, and the payload in a packet, and sends the packet in the intranet 26. The second access point receives the packet sent by the first access point and obtains the fixed-length address and the payload from the packet. The second access point searches its memory for the fixed-length address and, if the address is found, transmits the payload via the access network to the second client.
In the above, "payload" refers to load data from which the reference address or reference addresses have been omitted. Instead of the packet with the reference address, the packet with the fixed-length address is transmitted from the first access point to the second access point. When client Beta connected to the second access point, the following operations were performed. 1) The second access point received from client Beta the same reference address that client Alfa placed in its data. In this case, the reference address is client Beta's e-mail address. 2) After that, the second access point generated a fixed-length address by applying a hash function to the reference address and stored the fixed-length address and data related to the connection in a memory. Thus, when the second access point receives a packet from the intranet, it searches the memory for the fixed-length address included in the packet. If the said address is found, the second access point sends the content of the packet to client Beta's IP address. Otherwise, the packet is discarded. In more detail, fixed-length addresses are stored in the memory in a data structure intended for search operations. Search means are discussed in patent application PCTxxxxxxxxxx.
It is also possible that clients Alfa and Beta are connected to the same access point. Then the said access point can send client Alfa's data directly to client Beta without using the intranet.
FIG. 3 shows an access point system composed of one access point and a gateway connected to each other by means of an intranet. Client Alfa 31 and client Beta 32 are logged into a service that is provided by at least two different domains. Client Alfa has an e-mail address alfa@wiral.com and client Beta has an e-mail address beta@notwiral.com. Client Alfa sends data including a reference address via an access network 33 to client Beta's e-mail address. In this case, client Beta's e-mail address is the reference address. The access point 34 receives the data, including the reference address and the payload, sent by client Alfa. Then the access point generates a fixed-length address by inputting the reference address to a hash function, places the fixed-length address and the payload in a packet, and sends the packet in the intranet 36. The gateway 35 receives the packet sent by the access point and obtains the fixed-length address and the payload from the packet. The gateway searches its memory for the fixed-length address and, if the address is not found, transmits the payload sent by the client to another access point 37. The other access point, in domain notwiral.com, transmits the data to client Beta.
An objective of the inventive method is to increase the communication capacity between a set of servers coupled, directly or indirectly, to an intranet. At least one of said servers is an access point which is adapted to communicate with clients. The access point may or may not be connected to an access network, such as the Internet. The method comprises the steps of receiving, in an access point, load data including a possibly long reference address from a client, and generating a relatively short fixed-length address by applying a hash function to the reference address. The fixed-length address and the payload are then placed in a packet, and the packet is sent to the set of servers using the intranet. The method reduces data volume between the servers, and thus increases the communication capacity.
Another objective of the invention is to implement an application level message router that is composed of commercial off-the-shelf (COTS) hardware. The prior art solutions, such as Jabber, include a separate router server. The systems shown in FIGS. 2 and 3 can be considered as application level message routers. A separate router server is unnecessary, because access points and a gateway are able to route messages. A system of a message router includes at least two servers that are connected together with at least one private network.
FIG. 4 depicts a hash function and its input and output. The hash function 42 obtains a relatively long reference address 41 of variable length as a parameter and outputs a relatively short fixed-length address 43. The reference address 41 may be 3-511 bytes long, whereas the fixed-length address 43 is preferably 16 bytes long. The hash function 42 is preferably a Digest hash function, or to be more specific, the MD5 message-digest algorithm described in RFC 1321 published by the Internet engineering task force (IETF).
FIG. 5A shows a packet to be transmitted in an intranet. The packet 501 includes a header 502, a fixed-length address 503 and a payload 504. Load data 505 is composed of the payload 504 and the fixed-length address 503. The header is fixed-length and includes network- and protocol-specific information and the size of the load data.
The intranet is preferably an Ethernet network and packets to be transmitted in the intranet are preferably IP packets. However, the invention is not limited to Ethernet or Internet protocol.
FIG. 5B shows a packet to be transmitted in an Ethernet network. In more detail, the intranet is here an Ethernet network using Internet protocol. The packet 505 includes an Ethernet header 506, an IP-header 507, a data part header 508, and a payload 509. In this example, the data part header is composed of the following information: a version 510 defining the structure of the data part header, a fixed-length address 511, a second fixed-length address 512, and a data length 513 defining the size of the payload. The fixed-length address is generated by inputting a user name and a domain name to a hash function, and the second fixed-length address is generated by inputting a domain name to a hash function.
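The data part header of FIG. 5B could be serialized, for example, as sketched below. The field widths (a 1-byte version and a 2-byte data length), the network byte order, the joining of user name and domain name as "user@domain", the use of MD5, and the names HEADER, build_data_part, and parse_data_part are assumptions made for illustration; the text specifies only the order and meaning of the fields.

```python
import hashlib
import struct

# Assumed layout: version (1 byte), fixed-length address (16 bytes),
# second fixed-length address (16 bytes), data length (2 bytes), big-endian.
HEADER = struct.Struct("!B16s16sH")


def md5_address(text: str) -> bytes:
    """16-byte MD5 digest used as a fixed-length address."""
    return hashlib.md5(text.encode("utf-8")).digest()


def build_data_part(user: str, domain: str, payload: bytes, version: int = 1) -> bytes:
    """Build the data part header followed by the payload."""
    first = md5_address(f"{user}@{domain}")  # user name and domain name
    second = md5_address(domain)             # domain name only
    return HEADER.pack(version, first, second, len(payload)) + payload


def parse_data_part(data: bytes):
    """Split a data part into version, both addresses, and the payload."""
    version, first, second, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    return version, first, second, payload


data = build_data_part("beta", "notwiral.com", b"hello")
print(HEADER.size, len(data))
```

With this assumed layout the data part header occupies 35 bytes regardless of how long the original reference address was.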
The fixed-length address is preferably intended for the use of an access point, and the second fixed-length address is preferably intended for the use of a gateway. When an access point receives a packet from an intranet, it compares the fixed-length address with the fixed-length addresses stored in its memory. If the fixed-length address matches one of them, the access point handles the packet as described with reference to FIG. 2 or 3. When the node of the intranet is a gateway, the node/gateway compares, not the first, but the second fixed-length address with a fixed-length address stored in its memory, wherein the said stored fixed-length address is generated from the domain name of the access point system. If they do not match, the node/gateway transmits the packet to another access point system.
Usually, when a packet includes two fixed-length addresses, the fixed-length address is intended for routing the packet inside an access point system and the second fixed-length address is intended for routing the packet from the access point system to another system.
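The two comparisons can be sketched as follows. The function names, the returned action strings, and the use of a set as the access point's memory are hypothetical, and MD5 is assumed as the hash function. The text does not state what a gateway does when the second address matches its own domain, so the sketch merely reports that no forwarding is needed.

```python
import hashlib
from typing import Set


def md5_address(text: str) -> bytes:
    """16-byte MD5 digest of a reference address or domain name."""
    return hashlib.md5(text.encode("utf-8")).digest()


def access_point_action(first_address: bytes, local_clients: Set[bytes]) -> str:
    """An access point looks only at the first fixed-length address."""
    return "deliver" if first_address in local_clients else "discard"


def gateway_action(second_address: bytes, own_domain: str) -> str:
    """A gateway compares the second address with the hash of its own domain."""
    return "local" if second_address == md5_address(own_domain) else "forward"


# Hypothetical packet addressed to beta@notwiral.com, seen inside wiral.com.
first = md5_address("beta@notwiral.com")
second = md5_address("notwiral.com")
local_clients = {md5_address("alfa@wiral.com")}

print(access_point_action(first, local_clients))
print(gateway_action(second, "wiral.com"))
```

Because both comparisons operate on 16-byte values, neither node needs to parse a user name or a domain name from the packet.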
When an access point or a gateway receives a packet from the intranet, as described with reference to FIGS. 2 and 3, the access point or the gateway searches its memory for the fixed-length address.
A packet to be transmitted in an intranet may include one, two, or more fixed-length addresses. If the packet includes at least two fixed-length addresses, one of the addresses may cause a predefined operation in a receiving access point or a gateway. The predefined operation may be, for example, the comparing of domain names as described above.
We may suppose that usually a fixed-length address is generated from just one reference address identifying a sender, a receiver, a service, or content, such as a WWW page. However, in some services it might be preferable that some predefined combination of reference addresses is input as a parameter to a hash function, for example, a character string composed of a sender and a receiver. The combination of single reference addresses may result in a longer reference address from which it is easier to generate a unique fixed-length address than from a single reference address.
FIG. 6 shows a system that contains three access points, a load balancer, and a gateway. Client Alfa 61 communicates with client Beta 62 bidirectionally via the said system. Either of the said clients has started the communication by sending data via an access network 63 to a load balancer 64. The load balancer takes care that the three access points 65, 66, and 67 are uniformly loaded. The three access points and a gateway 68 are connected to an intranet 69. The load balancer may operate, for example, as follows: It receives an IP-packet from a client and obtains a sender related reference address from the packet. Then the load balancer inputs the said reference address to a hash function that outputs a fixed-length address. The load balancer reduces the fixed-length address modulo three, resulting in one of the numbers 0, 1, or 2. The numbers 0, 1, and 2 are related to the access points 65, 66, and 67 as shown in FIG. 6. If a sender application is located in a mobile device, it is important that the load balancer uses a sender related reference address, because the sender's IP address may change during the communication. If the access network is a fixed network, the load balancer may use the sender's IP address in its modulo function.
In FIG. 6 the network 69 could connect the access points and the gateway to the load balancer. Alternatively, another private network can be used for transmitting data from the load balancer to the access points. It is also possible that the load balancer and the access points communicate via the access network 63, or via another public network.
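The modulo operation of the load balancer can be sketched as follows. Interpreting the 16-byte address as a big-endian unsigned integer is an assumption, as are the use of MD5 and the name choose_access_point; the list ACCESS_POINTS simply reuses the reference numerals of FIG. 6.

```python
import hashlib

# Reference numerals of the three access points in FIG. 6.
ACCESS_POINTS = [65, 66, 67]


def choose_access_point(sender_reference: str) -> int:
    """Select an access point from the sender related reference address."""
    digest = hashlib.md5(sender_reference.encode("utf-8")).digest()
    # Assumption: the 16 bytes are read as one big-endian unsigned integer.
    index = int.from_bytes(digest, "big") % len(ACCESS_POINTS)
    return ACCESS_POINTS[index]


chosen = choose_access_point("alfa@wiral.com")
print(chosen)
```

Since the selection depends only on the reference address, all packets of the same sender reach the same access point even if the sender's IP address changes during the communication.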
In addition to the elements shown in FIGS. 2, 3, and 6, the system may also include a login and authentication server. The said server authenticates users and permits logins to the system. Alternatively, the access points may have a shared authentication database, whereby they can authenticate users and permit logins.
An access point system composed of COTS hardware is cost-effective and scalable. A new server can be easily added to the access point system to upgrade the capacity of the system or for another reason.
When an access point system is intended for providing instant messaging services, the system includes at least an access network and an intranet. The system preferably uses 1) Internet protocol and the unicast delivery method in the access network and 2) Internet protocol and the broadcast, multicast, or anycast delivery method in the intranet, which is preferably a type of Ethernet. The unicast delivery method is intended for one-to-one connections, the other delivery methods being intended for one-to-many connections. The broadcast and multicast delivery methods are usable in IPv4; the multicast and anycast delivery methods are usable in IPv6.
The use of the broadcast, multicast, or anycast method enables an access point system to be composed of at least two servers, so that one server is an access point and another server is either an access point or a gateway. In order to make a system more reliable, one or more access points may operate as a spare unit, so that the spare unit is put into use if an access point fails or the access points are overloaded.
The connections between clients may or may not be bi-directional. It is possible that only one client sends data and another client just receives data. It is also possible that a client sends data to many clients, for example, in short messages.
The devices in which client applications are located may or may not be mobile devices. The devices may also be personal computers.
The method and system described above can be utilized as a platform of instant messaging services. In addition, both of them can be utilized in message routing, wherein at least a start message should include at least
one reference address. The start message begins a connection between clients. Messages to be routed may be of any message type that is possible to transmit in an access network.