time to leave

Time to go for some peace and rest, and to keep your life and work in balance.

One more thing: nothing is worth all your commitment except your family.

nagios: using check_http to monitor a URL

Here is a sample for monitoring a specific HTTP URL in the Nagios config file.

You can run the command directly in a shell to test that it works:

[root@cent libexec]# ./check_http -u /accounts/login/ -H 121.xx.10.xx
HTTP OK: HTTP/1.1 200 OK - 9555 bytes in 0.055 second response time |time=0.054723s;;;0.000000 size=9555B;;;0

That means the command works: it detected the HTTP service and got a 200 OK response back.
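If you also want to verify the page content rather than just the status code, check_http's -s option checks the response body for an expected string (the "login" value here is just an assumption about the page):

[root@cent libexec]# ./check_http -H 121.xx.10.xx -u /accounts/login/ -s "login"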

Edit the cfg file to add a service definition:

define service{
use                                     local-service
host_name                               seafile
service_description                     web http service for seafile portal login page
check_command                           check_http!-H 12x.xx.160.xx -u /accounts/login/
notifications_enabled                   1
}
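Note that for the arguments after the "!" to reach the plugin, the check_http command definition in commands.cfg must pass $ARG1$ through. A minimal definition, assuming the standard $USER1$ macro for the plugin path, would be:

define command{
        command_name    check_http
        command_line    $USER1$/check_http $ARG1$
        }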


[FW]New MongoDB Connector for Apache Spark Enables New Fare Calculation Engine, Supporting 180m Fares and 1.6 billion Queries per Day, Migrated off Oracle.

source :  https://www.mongodb.com/blog/post/mongodb-and-apache-spark-at-china-eastern-airlines


As one of the world’s largest airlines, China Eastern constantly explores emerging technologies to identify new ways of improving customer experience and reducing cost.

Earlier this year, I interviewed its engineering leaders to learn more about the migration of China Eastern’s flight search platform from Oracle to MongoDB. As a result of the migration, China Eastern achieved orders-of-magnitude performance improvements, and enabled the delivery of application features that had never before been possible.

The next stage of China Eastern’s initiative to transform user experience has been to address airline fare calculations, and serve them reliably, at scale, to the migrated search application.

At this year’s MongoDB World user conference, Chong Huang, Lead Architect at China Eastern Airlines, presented details on their new fare calculation engine built with Apache Spark and MongoDB to support more than 1.6 billion queries per day.

The Challenge

Based on averages used across the airline industry, 12,000 searches are needed to generate a single ticket sale. China Eastern currently sells 260,000 airline seats every day, and is targeting 50% of those sales to come from its online web and mobile channels. Selling 130,000 seats equates to 1.6 billion searches and fare requests per day, or just under 20,000 searches a second. However, the current fare calculation system built on the Oracle database supports only 200 searches per second. China Eastern’s engineers realized they needed to radically re-architect their fare engine to meet the required 100x growth in search traffic.
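(Sanity-checking the arithmetic: 130,000 seats × 12,000 searches per sale ≈ 1.56 billion searches per day, and 1.56 billion ÷ 86,400 seconds ≈ 18,000 searches per second.)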

The Solution

Rather than calculate fares for each search in real time, China Eastern decided to take a different approach. In the new system, they pre-compute fares every night, and then load those results to the database for access by the search application.

The flight inventory data is loaded into an Apache Spark cluster, which calculates fares by applying business rules stored in MongoDB. These rules compute fares across multiple permutations, including cabin class, route, direct or connecting flight, date, passenger status, and more. In total, 180 million prices are calculated every night! The results are then loaded into MongoDB where they are accessed by the search application.

Why Apache Spark?

The fare engine demands complex computations against large volumes of flight inventory data. Apache Spark provides an in-memory data processing engine that can be distributed and parallelized across multiple nodes to accelerate processing. Testing confirmed linear scalability as CPU cores were added to the cluster. And since China Eastern is a Java shop, Apache Spark’s Java API also keeps development simple for the engineering team.

Why MongoDB?

China Eastern had already implemented a successful project with MongoDB. They knew the database would scale to meet the throughput and latency needs of the fare engine, and they could rely on expert support from MongoDB engineers. The database’s flexible data model would allow multiple fares to be stored for each route in a single document, accessed in one round trip to the database, rather than having to JOIN many tables to serve fares from their Oracle database.

Figure 1: Fare Engine Architecture

Why the MongoDB Connector for Apache Spark?

Just as the project was starting, MongoDB announced the new Databricks-certified MongoDB Connector for Apache Spark.

Using the Connector, Spark can directly read data from the MongoDB collection and turn it into a Spark Resilient Distributed Dataset (RDD), against which transformations and actions can be performed. Internally the Connector uses the splitVector command to create chunk splits, and each chunk can then be assigned to one Spark worker for processing. Data locality awareness in the Connector ensures RDDs are co-located with the associated MongoDB shard, thereby minimizing data movement across the network and reducing latency. Once the transformations and actions have been completed on the RDD, the results can be written back to MongoDB.
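As an illustration only, and not China Eastern's actual code, here is a minimal Java sketch of this read-transform-write cycle with the Connector; the URIs, collection names, and fare fields ("fares.rules", "baseFare", and so on) are all assumptions:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.bson.Document;
import com.mongodb.spark.MongoSpark;
import com.mongodb.spark.rdd.api.java.JavaMongoRDD;

public final class FarePrecompute {
    public static void main(String[] args) {
        // Hypothetical URIs and collection names, not the real deployment.
        SparkConf conf = new SparkConf()
                .setAppName("fare-precompute")
                .set("spark.mongodb.input.uri",  "mongodb://mongos1/fares.rules")
                .set("spark.mongodb.output.uri", "mongodb://mongos1/fares.prices");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        // The connector splits the input collection into partitions so that
        // each Spark worker reads one chunk, ideally from a co-located shard.
        JavaMongoRDD<Document> rules = MongoSpark.load(jsc);

        // Placeholder pricing: one output document per rule. The real engine
        // expands each rule across cabin/route/date/passenger permutations.
        JavaRDD<Document> fares = rules.map(rule -> new Document()
                .append("route", rule.getString("route"))
                .append("cabin", rule.getString("cabin"))
                .append("price", rule.getDouble("baseFare")));

        // Write the pre-computed fares back to MongoDB for the search app.
        MongoSpark.save(fares);
        jsc.stop();
    }
}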

The Implementation

Based on performance benchmarking, a cluster of fewer than 20 Red Hat Linux servers (comprising the app servers and the Spark and MongoDB clusters) is required to meet the demands of 180 million fares and 1.6 billion daily searches. China Eastern is using Apache Spark 1.6 against the latest MongoDB 3.2 release, provisioned with Ops Manager for operational automation. Testing has shown each node in the cluster scaling linearly, and delivering 15x higher performance at 10x lower latency than the previous Oracle-based system.

Next Steps

In his MongoDB World presentation, Mr. Huang provides more insight into the platform built with Apache Spark and MongoDB. He shares detailed steps and code samples that show how to download and set up the Spark cluster, how to configure the MongoDB Connector for Apache Spark, the process for submitting a job, and lessons learned along the way to optimize performance.

nagios mail notify

There are lots of services monitored by your Nagios; what if one of them goes down?
The most important thing is that the admin gets notified immediately, and email is a great tool for that.

1.contact.cfg

Edit the contact.cfg file to tell Nagios who the admin is that should receive all the mail:

# template which is defined elsewhere.

define contact{
        contact_name                    nagiosadmin             ; Short name of user
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           Nagios Admin            ; Full name of user

        email                           admin@gmail.com  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }

2.commands.cfg

Nothing needs to be changed in commands.cfg; the relevant definitions are shown below:

# 'notify-host-by-email' command definition
define command{
        command_name    notify-host-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
        }

# 'notify-service-by-email' command definition
define command{
        command_name    notify-service-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
        }

3.using mailx in Linux to send mail

Edit /etc/mail.rc, which is the config file of the mailx command that /bin/mail in the definitions above invokes:

# add by admin

set from=sender@gmail.com smtp=smtp.gmail.com
set smtp-auth-user=sender@gmail.com smtp-auth-password=pwd_email_account smtp-auth=login
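Before wiring this into Nagios, it's worth confirming that mail delivery itself works, for example (reusing the admin address from contact.cfg above):

echo "nagios mail test" | /bin/mail -s "test mail from nagios host" admin@gmail.com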

4.windows.cfg
Enable mail notification by adding the option "notifications_enabled 1" to each service definition:

# Change the host_name to match the name of the host you defined above

define service{
        use                     generic-service
        host_name               COS360,PRINTER_SRV_BJ,MENJIN,VM_BJ
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        notifications_enabled   1
        }

// adding notifications_enabled 1 to a service definition enables notifications for it
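After editing the cfg files, validate the configuration and reload Nagios before testing; the paths below assume the default install prefix of /usr/local/nagios:

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
service nagios restart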

5.testing

Bring down one of the services and you will get an email notification.

–EOF–

Better high availability: MySQL and Percona XtraDB Cluster with good application design

Source:
https://www.percona.com/blog/2015/12/21/mysql-and-percona-xtradb-cluster-even-higher-availability-with-correct-application-design/

High Availability

Have you ever wondered if your application should be able to work in read-only mode? How important is that question?

MySQL seems to be the most popular database solution for web-based products. Most typical Internet application workloads consist of many reads and usually few writes. There are exceptions of course, MMO games for instance, but often the number of reads is much bigger than the number of writes. So when your database infrastructure loses its ability to accept writes, either because a traditional MySQL replication topology lost its master or a Galera cluster lost its quorum, why would you want the application to declare total downtime? During this scenario, imagine all the users who are just browsing the application (not contributing content): they don’t care if the database cannot accept new data. People also prefer to have access to an application, even if its functionality is notably reduced, rather than see 500 error pages. In some disaster scenarios it is a seriously time-consuming task to perform PITR or recover some valuable data: it is better to at least give users read access to a recent backup.

My advice: design your application with a possible read-only partial outage in mind, and test how the application works in that mode during its development life cycle. I think it will pay off greatly and increase the perceived availability of your product. As an example, check out some of the big open source projects’ implementations of this concept, like MediaWiki or Drupal (and also some commercial products).
PXC

Having said that, I want to highlight a pretty new and, IMHO, important improvement in this regard, introduced in Galera replication as of PXC version 5.6.24. It was already mentioned by my colleague Stéphane in his blog post earlier this year.
Focus on data consistency

As you probably know, one of Galera’s key advantages is its careful, data-centric approach to consistency: no matter where you write in the cluster, all nodes must end up with the same data. This matters when you realize what happens when a data inconsistency is detected between the nodes. Inconsistent nodes, which cannot apply a writeset due to missing rows or duplicate unique key values for instance, have to abort and perform an emergency shutdown. This happens in order to remove contaminated members from the cluster and not spread the data “illness” further. If the majority of nodes perform an emergency abort, the remaining minority may lose the cluster quorum and will stop serving further client requests. So the price for data consistency protection is availability.
Anti-split-brain

Sometimes a node, or a group of cluster members, loses connectivity to the others in such a way that more than 50% of the nodes can no longer communicate. Connectivity is lost all of a sudden, without a proper “goodbye” message from the “disappeared” nodes. These nodes don’t know the reason for the lost connection: were the peers killed, or was the network split? In that situation, the nodes declare a non-Primary cluster state and go into SQL-disabled mode. This is because a member of a cluster without a quorum (majority), hence not acting as the Primary Component, is not trusted, as it may have inconsistent or old data. Because of this state, it won’t allow clients to access it.

This is for two reasons. First, and unquestionably, we don’t want to allow writes when there is a risk of a network split in which the other part of the cluster still forms the Primary Component and keeps operating. Second, we may also want to disallow reads of stale data when there is a possibility that the other part of the infrastructure already has much newer information.

In the standard MySQL replication process there are no such precautions: if replication is broken in a master-master topology, both masters can still accept writes, and slaves can still be read regardless of how much they may be lagging, or whether they are connected to their masters at all. In Galera though, even if too much lag in the applying queue is detected (similar to the replication lag concept), the cluster will pause writes using a Flow Control mechanism. And if replication is broken as described above, it will even stop the reads.
Dirty reads from Galera cluster

This behavior may seem too strict, especially if you have just migrated from MySQL replication to PXC and are used to “slave” nodes serving read traffic even when separated from the “master,” or if your application does not rely on writes but mostly on access to existing content. In that case, you can either enable the new wsrep_dirty_reads variable dynamically (per session, only when needed), or set up your cluster to run with this option by default by placing wsrep_dirty_reads = ON in my.cnf (setting the global value in the config file is available since PXC 5.6.26).
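A quick sketch of both forms (the semantics are exactly as described above):

-- per session, only when the application decides stale reads are acceptable:
SET SESSION wsrep_dirty_reads = ON;

# or by default for the node, in my.cnf (the config-file form needs PXC >= 5.6.26):
[mysqld]
wsrep_dirty_reads = ON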

The Galera topology below is something we very often see at customer sites, where WAN locations are configured to communicate via VPN:

Figure: WAN PXC topology

I think this failure scenario is a perfect use case for wsrep_dirty_reads: neither part of the cluster is able to work at full functionality alone, but each could successfully keep serving read queries to the clients.

So let’s quickly see how the cluster member behaves with the wsrep_dirty_reads option disabled and enabled (for the test I blocked network communication on port 4567):

percona3 mysql> show status like 'wsrep_cluster_status';
+----------------------+-------------+
| Variable_name        | Value       |
+----------------------+-------------+
| wsrep_cluster_status | non-Primary |
+----------------------+-------------+
1 row in set (0.00 sec)
percona3 mysql> show variables like '%dirty_reads';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| wsrep_dirty_reads | OFF   |
+-------------------+-------+
1 row in set (0.01 sec)
percona3 mysql>  select * from test.g1;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use


And when enabled:
percona2 mysql> show status like 'wsrep_cluster_status';
+----------------------+-------------+
| Variable_name        | Value       |
+----------------------+-------------+
| wsrep_cluster_status | non-Primary |
+----------------------+-------------+
1 row in set (0.00 sec)
percona2 mysql> show variables like '%dirty_reads';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| wsrep_dirty_reads | ON    |
+-------------------+-------+
1 row in set (0.00 sec)
percona2 mysql> select * from test.g1;
+----+-------+
| id | a     |
+----+-------+
|  1 | dasda |
|  2 | dasda |
+----+-------+
2 rows in set (0.00 sec)
percona2 mysql> insert into test.g1 set a="bb";
ERROR 1047 (08S01): WSREP has not yet prepared node for application use

MySQL

In traditional replication, you are probably using the slaves for reads anyway. So if the master crashes, and for some reason a failover toolkit like MHA or PRM is not configured or also fails, then in order to keep the application working you should direct new connections meant for the master to one of the slaves. If you use a load balancer, maybe just have the slaves as backups for the master in the write pool. This may help to achieve a better user experience during the downtime, where everyone can at least use existing information. As noted above, however, the application must be prepared to work that way.

There are caveats to this implementation, as the “read_only” mode that is usually used on slaves is not 100% read-only: it makes an exception for “super” users. In this case, the new super_read_only variable comes to the rescue, available in Percona Server 5.6 as well as in stock MySQL 5.7. With this feature, there is no risk that, after pointing database connections to one of the slaves, some privileged user will change the data.

If a disaster is severe enough, it may be necessary to recover data from a huge SQL dump, and it’s often hard to find enough spare servers to serve the traffic with an old binary snapshot. It’s worth noting that InnoDB has a special read-only mode, meant to be used in a read-only medium, that is lightweight compared to full InnoDB mode.
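For example, a spare server could be pointed straight at a restored binary snapshot in that mode; the datadir path here is a placeholder, and innodb_read_only is available from MySQL/Percona Server 5.6:

mysqld --innodb-read-only --read-only --datadir=/mnt/backup/datadir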

–EOF–

nagios

You can use Nagios to monitor your basic IT infrastructure resources.

You should first define a host to be monitored, then use a service definition to do the actual checks; it is the same idea as object-oriented thinking.

Taking the Windows (NT) platform as an example, install the NSClient++ plugin as the agent on the hosts you want to monitor:

define host{
        use             windows-server  ; Inherit default values from a template
        host_name       host1  ; The name we're giving to this host
        alias           COS Security Server     ; A longer name associated with the host
        address         172.16.60.134   ; IP address of the host
        }


define service{
        use                     generic-service
        host_name               host1
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
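Other check_nt variables follow the same pattern; for example, a hypothetical CPU load service that warns at 80% and goes critical at 90%, averaged over 5 minutes:

define service{
        use                     generic-service
        host_name               host1
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }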


check_command is the plugin action that actually monitors the resource; there are lots of monitoring plugins under $nagios/libexec (as in the check_http example earlier), and you can add more as needed.

–EOF–

wordpress post to pdf issue

For the post-to-PDF issue, you need the post2pdf plugin, and you need to solve the mbstring and encoding issue in PHP.

Work as follows:
http://www.knowledgebase-script.com/kb/article/how-to-enable-mbstring-in-php-46.html
//
Below is a sample excerpt php.ini file which contains the configuration of mbstring variables.
[mbstring]
mbstring.language = all
mbstring.internal_encoding = UTF-8
mbstring.http_input = auto
mbstring.http_output = UTF-8
mbstring.encoding_translation = On
mbstring.detect_order = UTF-8
mbstring.substitute_character = none
mbstring.func_overload = 0
mbstring.strict_encoding = Off
//

Configure PHP with the --enable-mbstring=all option.
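After rebuilding, you can confirm that the extension is actually loaded:

php -m | grep mbstring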


–EOF–

oracle password file issue

An old colleague asked for help online: his Oracle RAC environment hit an error rooted in a corrupt orapwd (password) file. This kind of issue can be fixed by following this guide:

source from:http://www.adp-gmbh.ch/ora/admin/password_file.html

Oracle’s password file

If the DBA wants to start up an Oracle instance, there must be a way for Oracle to authenticate this DBA, that is, to verify that (s)he is allowed to do so. Obviously, the password cannot be stored in the database, because Oracle cannot access the database before the instance is started up. Therefore, the authentication of the DBA must happen outside of the database. There are two distinct mechanisms to authenticate the DBA: using the password file, or through the operating system.
The init parameter remote_login_passwordfile specifies whether a password file is used to authenticate the DBA. If it is set to either shared or exclusive, a password file will be used.
Default location and file name
The default location for the password file is: $ORACLE_HOME/dbs/orapw$ORACLE_SID on Unix and %ORACLE_HOME%\database\PWD%ORACLE_SID%.ora on Windows.
Deleting a password file
If password file authentication is no longer needed, the password file can be deleted and the init parameter remote_login_passwordfile set to none.
Password file state
Whether a password file is shared or exclusive is also stored in the password file itself. After its creation, the state is shared. The state can be changed by setting remote_login_passwordfile and starting the database; that is, the database overwrites the state in the password file when it is started up.
A password file whose state is shared can only contain SYS.
Creating a password file
Password files are created with the orapwd tool.
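For example, to (re)create a password file for the current instance in the default location (the password value is a placeholder), which is also the usual fix for the corrupt orapwd file mentioned at the top:

orapwd file=$ORACLE_HOME/dbs/orapw$ORACLE_SID password=secret entries=10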
Adding Users to the password file
Users are added to the password file when they’re granted the SYSDBA or SYSOPER privilege.

SYS@ora10> show user;
USER is "SYS"
SYS@ora10> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP
------------------------------ ----- -----
SYS                            TRUE  TRUE

SYS@ora10> grant SYSDBA to rene;

Grant succeeded.

SYS@ora10> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP
------------------------------ ----- -----
SYS                            TRUE  TRUE
RENE                           TRUE  FALSE

SYS@ora10> grant SYSOPER to rene;

Grant succeeded.

SYS@ora10> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP
------------------------------ ----- -----
SYS                            TRUE  TRUE
RENE                           TRUE  TRUE

SYS@ora10> revoke SYSDBA from rene;

Revoke succeeded.

SYS@ora10> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP
------------------------------ ----- -----
SYS                            TRUE  TRUE
RENE                           FALSE TRUE

SYS@ora10> revoke SYSOPER from rene;

Revoke succeeded.

SYS@ora10> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP
------------------------------ ----- -----
SYS                            TRUE  TRUE

–EOF–

Firewall、DCD、TCP Keep alive

from :http://blogread.cn/it/article/5198?f=wb

In an earlier article of mine, "Oracle与防火墙" (Oracle and Firewalls), I mentioned that network firewalls will cut off TCP connections that have been idle for a long time; exactly how long is configurable inside the firewall. After the firewall cuts the connection, the following can happen:

 

  • Before the connection is cut, the Oracle session on it is executing a particularly long-running SQL statement, such as a stored procedure, during which no data at all is sent to the client. When the SQL finishes and the server tries to return the result to the client, the TCP connection has already been severed by the firewall, so an error obviously occurs: the connection is broken, and with it the session. The client, however, does not know this and keeps waiting for the server to return the result.

 

Judging from the first two of the cases above, the consequences of a firewall cutting a database TCP connection include: Continue reading "Firewall、DCD、TCP Keep alive"
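Although the excerpt is cut off here, the DCD in the title refers to Oracle's Dead Connection Detection, the usual countermeasure on the database side: setting SQLNET.EXPIRE_TIME in the server-side sqlnet.ora makes the server probe idle sessions periodically, which also keeps the connection from looking idle to the firewall:

# $ORACLE_HOME/network/admin/sqlnet.ora on the database server, value in minutes
SQLNET.EXPIRE_TIME = 10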

user local logon fail

I can access my Linux box over SSH, but local logon fails, with this error in /var/log/secure:

Jun 24 11:19:36 localhost login: pam_unix(login:session): session opened for user root by LOGIN(uid=0)
Jun 24 11:19:36 localhost login: Permission denied

The root cause is an error in /etc/pam.d/login:

[root@VM42 ~]# more /etc/pam.d/login
#%PAM-1.0
auth [user_unknown=ignore success=ok ignore=ignore default=bad] pam_securetty.so
auth       include      system-auth
account    required     pam_nologin.so
account    include      system-auth
password   include      system-auth
# pam_selinux.so close should be the first session rule
session    required     pam_selinux.so close
session    required     pam_loginuid.so
session    optional     pam_console.so
# pam_selinux.so open should only be followed by sessions to be executed in the user context
session    required     pam_selinux.so open
session    required     pam_namespace.so
session    optional     pam_keyinit.so force revoke
session    include      system-auth
-session   optional     pam_ck_connector.so
# add for oracle install
session    reqiured     pam_limits.so

The spelling of "reqiured" is incorrect. Correct it to "required", and everything works fine.
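The fixed line in /etc/pam.d/login:

session    required     pam_limits.so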

–EOF–